Air Quality Monitoring and Advanced Bayesian Modeling 0323902669, 9780323902663

Air Quality Monitoring and Advanced Bayesian Modeling introduces recent developments in urban air quality monitoring and


English Pages 316 [317] Year 2023

Table of contents :
Air Quality Monitoring and Advanced Bayesian Modeling
Copyright
Introduction
Clean versus polluted air
Sources and impacts of air pollutants
Air quality monitoring strategies
Modeling and forecasting of air pollution
About this book
References
Current air quality monitoring methods
Methods for criteria air pollutants
Carbon monoxide (CO)
Sulfur dioxide (SO2)
Nitrogen oxides (NO and NO2)
Ozone (O3)
Particulate matters (PM10 and PM2.5)
Real-time chemical composition monitoring
Particulate matters
Mass spectrometry for real-time PM measurement
Mass spectrometry based on electron impact (EI)
Mass spectrometry based on laser ionization desorption (LDI)
Ion chromatography for real-time PM measurement
Ion chromatographic systems for particles only
Ion chromatographic systems for gases and particles
Real-time measurement of trace elements in PM
Volatile organic compounds
Gas chromatography for real-time VOC measurement
Mass spectrometry for real-time VOC measurement
Other real-time techniques
Optical techniques for real-time measurements of gases
Thermal and optical techniques for real-time measurements of PM
Conclusions
References
Emerging air quality monitoring methods
Low-cost sensors
Electrochemical sensors
Metal oxide sensors
Optical sensors for PM
Sensors for VOCs
New considerations for low-cost sensors
Analytical merits
Potential interferences
Lab calibrations and field comparisons
Data correction
Data transmission and sensor networks
Mobile measurement platforms
On-road air quality monitoring
Powered and nonfixed-route vehicles
Powered and fixed-route vehicles
Nonpowered and nonfixed-route platforms
Requirements on monitoring method and data analysis
Air-borne air quality monitoring
Balloon-borne measurements
Manned-aircraft measurements
Unmanned-aircraft measurements
Other mobile measurement platforms
Conclusions
References
Traditional statistical air quality forecasting methods
Multiple linear regression (MLR)
Overview
Basics of multiple linear regression
Ridge regression and LASSO
Example: Estimation of AR(2) parameters with the multiple linear regression, the ridge regression, and the LASSO r ...
Classification and regression tree (CART)
Overview
Regression tree
Classification tree
Bagging and random forests
Example: Estimation of CO2 emissions from vehicle features with random forest
Multilayer perceptron
Overview
Basics of multilayer perceptron
Training algorithm of MLP
Example: Imputation of missing air quality data based on multilayer perceptron
Support vector regression (SVR)
Overview
Formulation of support vector regression
Case study
Overview
Prediction of PM2.5 and ground-level O3 concentrations of Macau
References
Advanced Bayesian air quality forecasting methods
Overview of technique limitations and advanced topics for improvement
Choice of model complexity
Necessity of model adaptiveness
Bayesian model class selection of linear regression model
Overview
Basics of Bayesian model class selection in linear regression model
Modeling of Keeling curve
Kalman filter-based adaptive air quality model
Overview
Basics of Kalman filter-based adaptive air quality model
Selection of perturbation matrix and measurement noise variance
Revisiting example 5.2.3 (modeling of Keeling curve) with the adaptive linear model
Time-varying multilayer perceptron
Overview
Basics of time-varying multilayer perceptron
Example: Prediction of Mackey-Glass time series by using the TVMLP model
Adaptive Bayesian model averaging of multiple time-varying regression models
Overview
Basics of dynamic Bayesian model averaging
Modeling of measured PM2.5 concentration of the low-cost sensor
Case study
Overview
Air quality forecasting in Macau with the adaptive linear models
References
Index



Author / Uploaded
Yongjie Li
Ka In Hoi
Kai Meng Mok
Ka Veng Yuen


AIR QUALITY MONITORING AND ADVANCED BAYESIAN MODELING

YONGJIE LI Department of Civil and Environmental Engineering, University of Macau, Avenida da Universidade, Taipa, Macau

KA IN HOI Department of Civil and Environmental Engineering, University of Macau, Avenida da Universidade, Taipa, Macau

KAI MENG MOK Department of Civil and Environmental Engineering, University of Macau, Avenida da Universidade, Taipa, Macau

KA VENG YUEN State Key Laboratory on Internet of Things for Smart City, Department of Civil and Environmental Engineering, University of Macau, Avenida da Universidade, Taipa, Macau

Elsevier Radarweg 29, PO Box 211, 1000 AE Amsterdam, Netherlands The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, United Kingdom 50 Hampshire Street, 5th Floor, Cambridge, MA 02139, United States Copyright © 2023 Elsevier Inc. All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher’s permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions. This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein). Notices Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary. Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility. To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein. ISBN: 978-0-323-90266-3 For information on all Elsevier publications visit our website at https://www.elsevier.com/books-and-journals

Publisher: Candice G Janco
Acquisitions Editor: Jennette McClain
Editorial Project Manager: Naomi Robertson
Production Project Manager: Rashmi Manoharan
Cover Designer: Vicky Pearson
Typeset by STRAIVE, India

CHAPTER 1

Introduction

Contents
1.1. Clean versus polluted air
1.2. Sources and impacts of air pollutants
1.3. Air quality monitoring strategies
1.4. Modeling and forecasting of air pollution
1.5. About this book
References

When we say a volume of air is “polluted,” we mean that it contains unwanted substances that might affect the species living in that volume of air or cause other effects on the ecosystem. As the key information conveyed to the public about the quality of air, an array of indexes is used to describe the levels of pollutants in the volume of air that concerns us. These indexes, normally called air quality indexes (AQI), are derived from numerical expressions that scale with the pollutant concentrations. For instance, a numeric index from 0 to 100 (or 0 to 10 in some countries) is commonly used to indicate the AQI, with 0 being excellent, though not achievable, and 100 being bad. There are, however, situations where AQI values reach 500 or more, which are normally labeled “severe” or “hazardous.” The AQI values are normally determined from the hourly concentration levels of several criteria pollutants measured automatically by environmental protection agencies. Therefore, before one can derive these indexes, the concentrations of the air pollutants have to be measured; and if one wants to know the anticipated AQI for tomorrow, forecasting models have to be invoked to predict the concentrations of the pollutants. These two topics are the focus of this book. In addition, regulatory and research purposes require much more detailed information on air pollutants than that covered by the AQI; such information is obtained from different measurement methods, which are also covered in this book.

Air Quality Monitoring and Advanced Bayesian Modeling https://doi.org/10.1016/B978-0-323-90266-3.00004-2

1.1 Clean versus polluted air

The question of how polluted a volume of air is, however, requires quantitative assessment of the concentration levels of each particular pollutant. Before that, perhaps some even more fundamental questions have to be answered first:
– What is unpolluted “clean” air supposed to contain?
– Has the composition of “clean” air been the same over geologic time since the formation of our atmosphere?
– What substances can be considered pollutants and need to be measured?

The first atmosphere of the Earth (c. 4.6 billion years ago) is believed to have consisted mainly of hydrogen (H2) and helium (He), which were swept away by the solar wind or lost through gravitational escape (Jacobson, 2002). The outgassing of carbon dioxide (CO2), water vapor, and assorted gases from the Earth’s mantle then formed the second atmosphere, which is referred to as the prebiotic atmosphere. The biotic atmosphere, which emerged after the appearance of living organisms (c. 3.5 billion years ago), consisted mainly of methane (CH4), molecular nitrogen (N2), sulfur dioxide (SO2), and CO2. After living organisms capable of photosynthesis appeared (c. 2.3 billion years ago), oxygen and ozone were produced; the oxygen buildup resulted in aerobic respiration, which led to more efficient production of N2, making it the major component of the atmosphere today.

Today, dry air is mainly composed of N2 (78.1% by volume) and O2 (20.9% by volume). The next most abundant gas is argon (Ar, 0.9% by volume), which is a noble (and extremely nonreactive) gas. Other nonreactive gases in clean air include neon (Ne), He, krypton (Kr), H2, and xenon (Xe), which have volume fractions of (1–200) × 10⁻⁷. All these are considered nonreactive and relatively invariable constituents of clean air and are thus not considered pollutants. Another relatively nonreactive (with a long atmospheric lifetime of about 15 years) and invariable gas, CO2 (currently with a volume fraction of around 0.04%, or 400 parts per million, ppm), is an important greenhouse gas that causes global warming; it is thus considered an air pollutant from anthropogenic activities (e.g., burning of fossil fuels). This “invariable” characteristic of CO2 is, however, a relative description.
Its concentration has increased from about 280 ppm before the Industrial Revolution to over 400 ppm today due to anthropogenic input, and it shows a clear annual pattern with lows in summer (efficient uptake by plant growth) and highs in winter (release from the decay of plant litter). Water vapor is also an important constituent of the atmosphere and is highly variable, with volume fractions ranging from a few parts per million to approximately 4%. Water vapor is generally not considered an air pollutant because it occurs mostly naturally, but its presence and dynamics greatly affect the hydrological cycle and meteorological phenomena.

Most air pollutants of concern are present in trace amounts, and their concentrations are highly variable (Jacobson, 2002). For instance, carbon monoxide (CO) has a volume mixing ratio of 0.04–0.2 ppm in clean environments, but it can reach 2–10 ppm in polluted environments. The mixing ratio of surface ozone can be as low as 10 ppb (parts per billion) in clean environments, but as high as 350 ppb in polluted regions. Sulfur dioxide (SO2) mixing ratios are normally less than 1 ppb in clean areas but can be as high as 30 ppb in polluted areas. Nitrogen oxides (mainly nitric oxide, NO, and nitrogen dioxide, NO2) have mixing ratios of 10 ppb. Particulate matters (PM) are suspended liquid or solid particles (Seinfeld and Pandis, 2016) that have sufficiently long lifetimes in the atmosphere to exert various effects on the environment. PM are normally classified by a cut-point diameter, with PM10 denoting particles with aerodynamic diameters < 10 μm and PM2.5 and PM1 of [...]

[...] 0 and $u_i = 0$ when $z_i$ is below the lower bound $y_i - \varepsilon$. Both $u_i$ and $l_i$ are equal to zero when the training data point is on or within the ε-insensitive tube. The optimal estimate of the parameters can be found by minimizing the objective function $E(\mathbf{w}, b)$ subject to the constraints in Eqs. (4.95) and (4.96).
However, this minimization problem involves many inequality constraints. Instead of performing this tedious constrained optimization, it is more convenient to maximize its Lagrangian dual function subject to some simple constraints. The final results will be the same for convex optimization. We can show that the objective function in Eq. (4.97) is convex, as the Hessian of $E(\mathbf{w}, b)$ is positive semidefinite for any $\mathbf{w}$ and $b$. To construct the Lagrangian dual function, the constraints of Eq. (4.97) are incorporated into the following Lagrangian:

$$L = C\sum_{i=1}^{N}(u_i + l_i) + \frac{1}{2}\|\mathbf{w}\|^2 - \sum_{i=1}^{N}\beta_i(\varepsilon + u_i - z_i + y_i) - \sum_{i=1}^{N}\gamma_i(\varepsilon + l_i + z_i - y_i) - \sum_{i=1}^{N}(\lambda_i u_i + \delta_i l_i) \tag{4.98}$$

where $\beta_i$, $\gamma_i$, $\lambda_i$, and $\delta_i$ are nonnegative Lagrange multipliers for the inequality constraints. The Lagrangian dual function is obtained from the infimum of the Lagrangian over $\mathbf{w}$, $b$, $u_i$, and $l_i$:

$$g(\beta_1, \ldots, \beta_N, \gamma_1, \ldots, \gamma_N) = \inf_{\mathbf{w}, b, u_i, l_i} L(\mathbf{w}, b, u_1, \ldots, u_N, l_1, \ldots, l_N, \beta_1, \ldots, \beta_N, \gamma_1, \ldots, \gamma_N) \tag{4.99}$$

The infimum of $L$ over $\mathbf{w}$, $b$, $u_i$, and $l_i$ can be found by setting their first-order derivatives to zero. Before taking the derivative, the prediction $y_i$ in Eq. (4.98) is replaced by the expression of the SVR model shown in Eq. (4.91). The first-order derivative of $L$ with respect to $\mathbf{w}$ is given by:

$$\frac{\partial L}{\partial \mathbf{w}} = \mathbf{w} - \sum_{i=1}^{N}\beta_i\varphi(\mathbf{x}_i) + \sum_{i=1}^{N}\gamma_i\varphi(\mathbf{x}_i) \tag{4.100}$$

Setting Eq. (4.100) to zero gives:

$$\hat{\mathbf{w}} = \sum_{i=1}^{N}(\beta_i - \gamma_i)\varphi(\mathbf{x}_i) \tag{4.101}$$

The first-order derivative of $L$ with respect to $b$ is given by:

$$\frac{\partial L}{\partial b} = -\sum_{i=1}^{N}\beta_i + \sum_{i=1}^{N}\gamma_i \tag{4.102}$$

Setting Eq. (4.102) to zero gives:

$$\sum_{i=1}^{N}(\beta_i - \gamma_i) = 0 \tag{4.103}$$

Other conditions governing the Lagrange multipliers are obtained similarly by setting the first-order derivatives of $L$ with respect to $u_i$ and $l_i$ to zero:

$$\beta_i + \lambda_i = C \tag{4.104}$$

$$\gamma_i + \delta_i = C \tag{4.105}$$

The Lagrangian dual function $g$ is obtained after substituting Eq. (4.101) and Eqs. (4.103)–(4.105) into Eq. (4.98):

$$\begin{aligned}
g(\beta_1, \ldots, \beta_N, \gamma_1, \ldots, \gamma_N) ={}& C\sum_{i=1}^{N}(u_i + l_i) + \frac{1}{2}\left(\sum_{i=1}^{N}(\beta_i - \gamma_i)\varphi^{T}(\mathbf{x}_i)\right)\left(\sum_{j=1}^{N}(\beta_j - \gamma_j)\varphi(\mathbf{x}_j)\right) \\
& - \sum_{i=1}^{N}(\beta_i + \gamma_i)\varepsilon - \sum_{i=1}^{N}(\beta_i + \lambda_i)u_i - \sum_{i=1}^{N}(\gamma_i + \delta_i)l_i + \sum_{i=1}^{N}(\beta_i - \gamma_i)z_i \\
& - \left(\sum_{i=1}^{N}(\beta_i - \gamma_i)\varphi^{T}(\mathbf{x}_i)\right)\left(\sum_{j=1}^{N}(\beta_j - \gamma_j)\varphi(\mathbf{x}_j)\right) - b\sum_{i=1}^{N}(\beta_i - \gamma_i)
\end{aligned} \tag{4.106}$$

After further elaboration, the final form of the Lagrangian dual function $g$ is shown as follows:

$$g(\beta_1, \ldots, \beta_N, \gamma_1, \ldots, \gamma_N) = -\sum_{i=1}^{N}(\beta_i + \gamma_i)\varepsilon + \sum_{i=1}^{N}(\beta_i - \gamma_i)z_i - \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}(\beta_i - \gamma_i)(\beta_j - \gamma_j)K(\mathbf{x}_i, \mathbf{x}_j) \tag{4.107}$$


where $K(\mathbf{x}_i, \mathbf{x}_j) = \varphi^{T}(\mathbf{x}_i)\varphi(\mathbf{x}_j)$ denotes the kernel function evaluated at $\mathbf{x}_i$ and $\mathbf{x}_j$. The Lagrange multipliers $\beta_i$ and $\gamma_i$ are called dual variables in this function, and they are subject to the constraints $\beta_i \ge 0$, $\gamma_i \ge 0$. Apart from that, upper limits on the dual variables are imposed by Eqs. (4.104) and (4.105). As $\lambda_i$ and $\delta_i$ are both nonnegative Lagrange multipliers, $\beta_i$ and $\gamma_i$ must be less than or equal to $C$. Therefore the optimal estimates of $\beta_i$ and $\gamma_i$ are found by solving the following Lagrangian dual optimization problem with updated constraints:

$$\max_{\beta_1, \ldots, \beta_N, \gamma_1, \ldots, \gamma_N} g(\beta_1, \ldots, \beta_N, \gamma_1, \ldots, \gamma_N) \tag{4.108}$$

subject to $0 \le \beta_i \le C$, $0 \le \gamma_i \le C$, $i = 1, \ldots, N$.

This optimization problem can be solved efficiently by using sequential minimal optimization (SMO) in Matlab. The SMO algorithm was originally proposed by John Platt in 1998 to speed up the optimization of the dual problem in SVM for classification (Platt, 1998). Later the SMO framework of Platt was extended to the SVR model for regression (Shevade et al., 2000; Flake and Lawrence, 2002; Smola and Schölkopf, 2004). Further insight into the dual variables can be gained by using the complementary slackness in the Karush-Kuhn-Tucker (KKT) conditions (Boyd and Vandenberghe, 2004). It states that the product of each dual variable and its corresponding constraint is equal to zero. So far, we have only the two conditions governing the Lagrange multipliers in Eqs. (4.104) and (4.105). By using these two conditions together with the complementary slackness, the relationships between the dual variables and the slack variables are given by:

$$\beta_i(\varepsilon + u_i - z_i + y_i) = 0 \tag{4.109}$$

$$\gamma_i(\varepsilon + l_i + z_i - y_i) = 0 \tag{4.110}$$

$$(C - \beta_i)u_i = 0 \tag{4.111}$$

$$(C - \gamma_i)l_i = 0 \tag{4.112}$$

When the constraint $\varepsilon + u_i - z_i + y_i$ becomes zero, it corresponds to the data points on or above the upper boundary of the ε-insensitive tube ($u_i \ge 0$). According to Eq. (4.109), each of these data points has a positive value of $\beta_i > 0$. When the constraint $\varepsilon + l_i + z_i - y_i$ becomes zero, it corresponds to the data points on or below the lower boundary of the ε-insensitive tube ($l_i \ge 0$). Each of these data points has a positive value of $\gamma_i > 0$ according to Eq. (4.110). As a data point can only lie on one side of the ε-insensitive tube, at most one of the two constraints can be active. Therefore $\beta_i$ and $\gamma_i$ cannot both be positive at the same time, and at least one of them must be zero. For points within the tube, both constraints $\varepsilon + u_i - z_i + y_i$ and $\varepsilon + l_i + z_i - y_i$ are greater than zero. Hence, $\beta_i$ and $\gamma_i$ are both equal to zero (Bishop, 2006).
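The dual function of Eq. (4.107) and the complementary slackness conditions of Eqs. (4.109)–(4.112) can be checked numerically for a candidate solution. The following is a minimal sketch in plain Python, not the SMO solver mentioned above; the function names, the default kernel parameter, and the toy numbers are assumptions made for illustration only.

```python
import math


def rbf_kernel(xi, xj, m=1.0):
    """RBF kernel exp(-m * ||xi - xj||^2); the default m is a hypothetical value."""
    return math.exp(-m * sum((a - b) ** 2 for a, b in zip(xi, xj)))


def dual_objective(beta, gamma, z, X, eps, kernel=rbf_kernel):
    """Evaluate g(beta, gamma) as written in Eq. (4.107)."""
    n = len(z)
    d = [beta[i] - gamma[i] for i in range(n)]  # beta_i - gamma_i
    tube = -eps * sum(beta[i] + gamma[i] for i in range(n))
    data = sum(d[i] * z[i] for i in range(n))
    quad = 0.5 * sum(d[i] * d[j] * kernel(X[i], X[j])
                     for i in range(n) for j in range(n))
    return tube + data - quad


def slack_variables(z_i, y_i, eps):
    """Smallest slacks (u_i, l_i) satisfying the two tube constraints."""
    u = max(0.0, z_i - (y_i + eps))  # excess above the upper boundary y_i + eps
    l = max(0.0, (y_i - eps) - z_i)  # excess below the lower boundary y_i - eps
    return u, l


def kkt_residuals(beta_i, gamma_i, z_i, y_i, eps, C):
    """Left-hand sides of Eqs. (4.109)-(4.112); all are zero at the optimum."""
    u, l = slack_variables(z_i, y_i, eps)
    return (beta_i * (eps + u - z_i + y_i),
            gamma_i * (eps + l + z_i - y_i),
            (C - beta_i) * u,
            (C - gamma_i) * l)


# Toy point lying 0.5 above the upper boundary, so u_i > 0 and beta_i must equal C.
print(kkt_residuals(beta_i=10.0, gamma_i=0.0, z_i=2.0, y_i=1.0, eps=0.5, C=10.0))
```

A nonzero entry in the returned tuple indicates that the candidate dual variables violate the corresponding complementary slackness condition for that data point.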


Now we can put together the results to update the support vector regression model in Eq. (4.91). By substituting the optimal weight vector $\hat{\mathbf{w}}$ and the dual optimal $\hat{\beta}_1, \ldots, \hat{\beta}_N, \hat{\gamma}_1, \ldots, \hat{\gamma}_N$ into Eq. (4.91), an updated expression for the SVR model is given by:

$$y(\mathbf{x}) = \sum_{i=1}^{N}(\hat{\beta}_i - \hat{\gamma}_i)\varphi^{T}(\mathbf{x}_i)\varphi(\mathbf{x}) + b = \sum_{i=1}^{N}(\hat{\beta}_i - \hat{\gamma}_i)K(\mathbf{x}_i, \mathbf{x}) + b \tag{4.113}$$

As $\beta_i - \gamma_i$ is equal to zero for points inside the ε-insensitive tube, the weighted sum of the kernel functions is contributed only by the points on or outside the boundary, and these points are called the support vectors. The bias $b$ is still unknown in this equation, but it can be found by utilizing one of the support vectors on the upper or lower boundary. Suppose a support vector on the upper boundary $(\mathbf{x}_k, y_k)$ is utilized. For this data point, $u_k = 0$ and $y_k = z_k - \varepsilon$. Therefore $b$ can be estimated by the following expression:

$$\hat{b} = z_k - \varepsilon - \sum_{i=1}^{N}(\hat{\beta}_i - \hat{\gamma}_i)K(\mathbf{x}_i, \mathbf{x}_k) \tag{4.114}$$

By substituting Eq. (4.114) into Eq. (4.113), the final form of the SVR model is given by:

$$y(\mathbf{x}) = \sum_{i=1}^{N_s}(\hat{\beta}_i - \hat{\gamma}_i)\left[K(\mathbf{x}_i, \mathbf{x}) - K(\mathbf{x}_i, \mathbf{x}_k)\right] + z_k - \varepsilon \tag{4.115}$$

where $N_s$ represents the number of support vectors in the training data. As the choice of kernel can affect the performance of the SVR model, Table 4.6 presents the kernels commonly used in previous air quality studies for reference. The radial basis function is the most popular kernel among these studies. In this kernel function, $\|\mathbf{x}_i - \mathbf{x}\|^2$ represents the squared Euclidean distance between the support vector and the new input $\mathbf{x}$. For a given $\|\mathbf{x}_i - \mathbf{x}\|^2$, the contribution to the prediction $y$ from the support vector is reflected by $K(\mathbf{x}_i, \mathbf{x})$. The kernel value decreases from 1 to 0 as the squared distance increases from zero to infinity. The hyperparameter $m$ in the RBF kernel is equal to $1/(2\sigma^2)$, and it controls how rapidly $K(\mathbf{x}_i, \mathbf{x})$ decreases with distance from the support vector $\mathbf{x}_i$.

Table 4.6 Kernels for SVR models used in air quality applications.
Kernel | Functional form | Related studies
Linear | $K(\mathbf{x}_i, \mathbf{x}) = \mathbf{x}^{T}\mathbf{x}_i$ | Hájek and Olej (2012); Vong et al. (2012); Leong et al. (2020)
Radial Basis Function (RBF)/Gaussian | $K(\mathbf{x}_i, \mathbf{x}) = \exp(-m\|\mathbf{x}_i - \mathbf{x}\|^2)$ | Lu and Wang (2005); Hájek and Olej (2012); Vong et al. (2012); Moazami et al. (2016); Leong et al. (2020); Su et al. (2020); Balogun and Tella (2022); Liu et al. (2022)
Polynomial | $K(\mathbf{x}_i, \mathbf{x}) = (\mathbf{x}^{T}\mathbf{x}_i + 1)^d,\ d > 1$ | Hájek and Olej (2012); Vong et al. (2012); Leong et al. (2020)
Sigmoid | $K(\mathbf{x}_i, \mathbf{x}) = \tanh(\mathbf{x}^{T}\mathbf{x}_i + 1)$ | Vong et al. (2012)
Wavelet | $K(\mathbf{x}_i, \mathbf{x}) = \prod_{j=1}^{m}\varphi\left(\frac{\mathbf{x} - \mathbf{x}_i}{\sigma}\right)$ | Vong et al. (2012)

Apart from the kernel function, the performance of the SVR model is also determined by the radius ε of the ε-insensitive tube and the regularization constant $C$. When ε is too small, almost all training data points become support vectors ($N_s \approx N$), and each of them contributes to the prediction for any new input $\mathbf{x}$ in Eq. (4.115). The model then becomes slow and memory-intensive when the training dataset is large (e.g., $N \approx 100{,}000$ for hourly data of an air pollutant collected over 11.5 years). On the contrary, the model may underperform when only a few support vectors can contribute to the prediction (ε is too large). The model may give a very small prediction for an input point whose adjacent support vectors are far away. As for the choice of $C$, it affects the importance of the ε-insensitive loss term relative to the penalty against model complexity. When $C$ is too small, few support vectors can contribute to the prediction. Hence, the model will underperform. When $C$ is larger than its optimal value or becomes infinite, the penalty given by $\frac{1}{2}\|\mathbf{w}\|^2$ becomes negligible in Eq. (4.93). Then the most complicated model, which uses all training points as support vectors, will be chosen since it has the minimum ε-insensitive loss. A similar conclusion can be drawn from the optimization of the Lagrangian dual form in Eq. (4.108). The regularization constant $C$ is the upper limit of the constraint on the dual variables $\beta_i$ and $\gamma_i$ in this equation. When $C$ tends to infinity, the optimization becomes unconstrained and the model is more likely to overfit the training data. The hyperparameters can be tuned by using k-fold cross-validation.
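The prediction of Eq. (4.113), with the bias taken from Eq. (4.114) and the RBF kernel of Table 4.6, can be sketched in a few lines of plain Python. The support vectors, dual coefficients, and parameter values below are hypothetical numbers chosen for illustration; they are not results from the text.

```python
import math


def rbf_kernel(xi, x, m):
    """K(xi, x) = exp(-m * ||xi - x||^2), with m = 1 / (2 * sigma^2)."""
    return math.exp(-m * sum((a - b) ** 2 for a, b in zip(xi, x)))


def svr_bias(support, coefs, k, z_k, eps, m):
    """Eq. (4.114): bias from support vector k, assumed to lie on the upper boundary."""
    return z_k - eps - sum(c * rbf_kernel(xi, support[k], m)
                           for xi, c in zip(support, coefs))


def svr_predict(x, support, coefs, bias, m):
    """Eq. (4.113): weighted sum of kernels over the support vectors plus the bias."""
    return sum(c * rbf_kernel(xi, x, m) for xi, c in zip(support, coefs)) + bias


# Hypothetical support vectors and dual coefficients (beta_i - gamma_i).
support = [[0.0], [2.0]]
coefs = [1.0, -1.0]
m = 1.0 / (2.0 * 0.5 ** 2)  # sigma = 0.5 (assumed) gives m = 2
bias = svr_bias(support, coefs, k=0, z_k=3.0, eps=0.5, m=m)

print(svr_predict([0.0], support, coefs, bias, m))   # at the support vector itself
print(svr_predict([50.0], support, coefs, bias, m))  # far from all support vectors
```

At the chosen support vector the prediction equals $z_k - \varepsilon$ by construction. Far from every support vector the kernel sum vanishes, so the prediction falls back to the bias term, which illustrates why a model with too few support vectors can underperform.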

Example 4.4.1. Kernel smoothing of the aerosol size distribution data based on the SVR regression

From previous chapters, we already know that atmospheric aerosols are important to human health, visibility, and aerosol-cloud interactions. Researchers usually study these linkages through the aerosol size distribution, which gives the number concentrations of particles within different size ranges (Stahlhofen et al., 1989; Andreae et al., 2004; Roberts et al., 2008; Mohan and Payra, 2014; Kim, 2015; Finlay and Darquenne, 2020). The size distribution can be measured by using the Fast Mobility Particle Sizer (FMPS) or the Scanning Mobility Particle Sizer (SMPS). However, the quality of the measured data is subject to sampling artifacts, such as the uncertainties of the size-dependent aerosol charging efficiency in the electrostatic classifier or the uncertainties in the sample flow rate, the DMA sheath flow rate, and the particle counter in the CPC (Wiedensohler et al., 2012, 2017). In this example, support vector regression is used to improve the data quality through kernel smoothing of the measured aerosol size distribution. Before that, the SVR model is applied to simulated data for verification. The simulated aerosol size distribution in this example is the sum of three log-normal distributions of the following form:

$$\frac{dN}{d\log D_p} = \sum_{i=1}^{3}\frac{N_i}{\sqrt{2\pi}\,\log\sigma_{g,i}}\exp\left[-\frac{(\log D_p - \log D_{g,i})^2}{2\log^2\sigma_{g,i}}\right] \tag{4.116}$$

where $D_p$ represents the particle diameter; $N_i$ represents the total number concentration of the ith log-normal distribution; and $D_{g,i}$ and $\sigma_{g,i}$ represent the geometric mean particle diameter and the geometric standard deviation of the ith log-normal distribution, respectively. Table 4.7 presents the parameters of the log-normal modes used in the simulated data. The measured size distribution is obtained by adding measurement noise with a signal-to-noise (S/N) ratio of 25 dB to the simulated distribution. The SVR model is built by using the Statistics and Machine Learning Toolbox of Matlab. The RBF kernel with $m = 1.7439$ was used in the SVR model. The radius ε (case 1: 1, case 2: 61, case 3: 261) and the regularization constant C (case 1: 2310, case 2: 8310, case 3: 8210) are obtained by using 10-fold cross-validation.
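The simulated size distribution of Eq. (4.116) and the added measurement noise can be sketched in plain Python as follows. Several details are assumptions rather than facts from the text: base-10 logarithms, zero-mean Gaussian noise, an S/N ratio defined on mean signal power, a hypothetical diameter grid, and illustrative mode parameters in the style of Table 4.7.

```python
import math
import random


def lognormal_mixture(dp, modes):
    """Eq. (4.116): dN/dlogDp at diameter dp for modes given as (N_i, sigma_g_i, D_g_i)."""
    total = 0.0
    for n_i, sigma_g, d_g in modes:
        log_sigma = math.log10(sigma_g)  # assumption: log means log10
        z = (math.log10(dp) - math.log10(d_g)) / log_sigma
        total += n_i / (math.sqrt(2.0 * math.pi) * log_sigma) * math.exp(-0.5 * z * z)
    return total


def add_noise(signal, snr_db, rng):
    """Add zero-mean Gaussian noise scaled to the requested signal-to-noise ratio (dB)."""
    signal_power = sum(s * s for s in signal) / len(signal)
    noise_std = math.sqrt(signal_power / 10.0 ** (snr_db / 10.0))
    return [s + rng.gauss(0.0, noise_std) for s in signal]


# Illustrative three-mode parameters (N_i in cm^-3, sigma_g, D_g in nm).
modes = [(10000.0, 1.5, 5.0), (5000.0, 1.5, 50.0), (2000.0, 1.5, 250.0)]
diameters = [2.0 * 1.06 ** i for i in range(100)]  # hypothetical grid, 2 nm to roughly 640 nm
clean = [lognormal_mixture(d, modes) for d in diameters]
noisy = add_noise(clean, snr_db=25.0, rng=random.Random(0))
```

The fixed random seed only makes the sketch reproducible; the noisy series plays the role of the "measured" distribution that the SVR model is asked to smooth.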
The root-mean-square errors of the SVR regression in the three cases are 102.7 cm⁻³ nm⁻¹ (30.9 dB), 71.2 cm⁻³ nm⁻¹ (30.3 dB), and 198.5 cm⁻³ nm⁻¹ (29.6 dB), respectively. The signal-to-noise ratio has improved substantially after kernel smoothing by the SVR model (Fig. 4.18).

Table 4.7 Parameters of log-normal modes used in the simulated data.
Case | 1st mode: N1 (# cm⁻³) | σg,1 (nm) | Dg,1 (nm) | 2nd mode: N2 (# cm⁻³) | σg,2 (nm) | Dg,2 (nm) | 3rd mode: N3 (# cm⁻³) | σg,3 (nm) | Dg,3 (nm)
Case 1 | 10,000 | 1.5 | 5 | 5000 | 1.5 | 50 | 2000 | 1.5 | 250
Case 2 | 5000 | 1.5 | 10 | 8000 | 3 | 100 | | |
Case 3 | 10,000 | 1.5 | 20 | 10,000 | 1.5 | 50 | 10,000 | 1.5 | 250

Fig. 4.18 Smoothed aerosol size distribution versus simulated distribution with 25 dB noise.

The SVR model is further applied to smooth the size distribution measured by the TSI 3938 Scanning Mobility Particle Sizer. The SMPS was located inside the laboratory on the ground floor of the faculty building at the University of Macau. The RBF kernel with $m = 1.7439$ was used in the SVR model. The radius ε (case 1: 31, case 2: 151, case 3: 331) and the regularization constant C (case 1: 8310, case 2: 6310, case 3: 8910) are chosen by using 10-fold cross-validation. Fig. 4.19 shows 15-min averaged size distributions (ASD) measured by the SMPS at three different periods (case 1: 08/10/2019 08:30 PM, case 2: 02/11/2019 12:00 PM, case 3: 06/11/2019 01:30 PM). The diameter ranges from 15.7 nm to 661.2 nm with 105 size channels. The smoothed ASD (pink line) is less noisy than the measured ASD (black dots). Therefore the application of the SVR model for kernel smoothing in this example is successful.
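The k-fold cross-validation used to choose the hyperparameters can be sketched as follows. To keep the sketch short and dependency-free, the model being tuned is a simple RBF-weighted average, which is a stand-in and not the SVR model of the text; the interleaved fold assignment, the function names, and the toy data are likewise assumptions.

```python
import math


def kfold_indices(n, k):
    """Split range(n) into k interleaved folds; returns (train, validation) index lists."""
    folds = [list(range(start, n, k)) for start in range(k)]
    splits = []
    for i in range(k):
        train = [j for f in range(k) if f != i for j in folds[f]]
        splits.append((train, folds[i]))
    return splits


def smooth(x_train, y_train, x, m):
    """Stand-in model (NOT SVR): RBF-weighted average of the training targets."""
    w = [math.exp(-m * (xt - x) ** 2) for xt in x_train]
    total = sum(w)
    if total == 0.0:  # all weights underflowed; fall back to the plain mean
        return sum(y_train) / len(y_train)
    return sum(wi * yi for wi, yi in zip(w, y_train)) / total


def cv_rmse(x, y, m, k=10):
    """Root-mean-square validation error of the stand-in model over k folds."""
    sq_err, count = 0.0, 0
    for train, val in kfold_indices(len(x), k):
        x_tr = [x[i] for i in train]
        y_tr = [y[i] for i in train]
        for i in val:
            sq_err += (smooth(x_tr, y_tr, x[i], m) - y[i]) ** 2
            count += 1
    return math.sqrt(sq_err / count)


def select_m(x, y, candidates, k=10):
    """Pick the candidate kernel parameter with the lowest cross-validated RMSE."""
    return min(candidates, key=lambda m: cv_rmse(x, y, m, k))


# Noise-free toy data on a straight line; candidate values are hypothetical.
x = [float(i) for i in range(20)]
y = list(x)
best_m = select_m(x, y, candidates=[0.01, 100.0], k=10)
```

For the SVR model the same loop would run over a grid of (ε, C) pairs, with the stand-in smoother replaced by a fitted SVR for each training fold.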

Fig. 4.19 Smoothed aerosol size distribution versus measured distribution with SMPS.

4.5 Case study

4.5.1 Overview

Traditional techniques for developing statistical air quality forecasting models (e.g., multiple linear regression, ridge regression, LASSO, classification and regression tree, multilayer perceptron, and support vector regression) have been introduced and demonstrated with illustrative examples or related applications in the previous sections. In this section, we focus on their application to air quality forecasting. These techniques are compared by modeling the PM2.5 and ground-level O3 concentrations at a coastal city in the Guangdong-Hong Kong-Macau Greater Bay Area of China between 2019 and 2020.

4.5.2 Prediction of PM2.5 and ground-level O3 concentrations of Macau

The study area in this case study is Macau, a gaming and tourism city located on the southwestern bank of the Greater Bay Area in China. The air quality data (from 2016 to 2020) were provided by the Macau Meteorological and Geophysical Bureau (SMG), which has officially developed and maintained the automatic air quality monitoring network of Macau since 1999. Currently, this network consists of six monitoring stations, located as shown in Fig. 4.20. Table 4.8 presents the characteristics and the pollutants monitored at each of these monitoring stations. Photos showing the ambient environment of these stations can be found on the website of the Macau SMG (https://www.smg.gov.mo/en).

Fig. 4.20 Locations of air quality monitoring stations in Macau.

Table 4.8 Characteristics and pollutants monitored at air quality monitoring stations of Macau.
Station name | Altitude (m) | Coordinates | Characteristics | Monitored pollutants
Roadside, Macao (PO) | 11.8 | 22°11′45″N 113°32′41″E | Commercial, Residential, Roadside | PM10, PM2.5, NO/NO2/NOx, CO, VOCs
High Density Residential Area, Macao (EN) | 9.6 | 22°12′50″N 113°32′34″E | Commercial, High Density Residential Area | PM10, PM2.5, NO/NO2/NOx, O3, CO, SO2
High Density Residential Area, Taipa (TC) | Ground level | 22°09′31″N 113°33′20″E | High Density Residential Area | PM10, PM2.5, NO/NO2/NOx, O3, CO, SO2
Ambient, Taipa (TG) | 110 | 22°09′36″N 113°33′54″E | Peak, General environment | PM10, PM2.5, NO/NO2/NOx, O3, CO, SO2
Ambient, Coloane (CD) | 5.6 | 22°07′31″N 113°33′16″E | General environment | PM10, PM2.5, NO/NO2/NOx, O3, CO, SO2
Roadside, Ká-Hó (KH) | Ground level | 22°07′59″N 113°35′01″E | Roadside | PM10, PM2.5, NO/NO2/NOx, O3, CO, SO2

As PM2.5 and tropospheric O3 are the dominant air pollutants in Macau, these two pollutants were selected as the target pollutants to be forecasted in this case study. Based on the raw hourly data, the daily averaged PM2.5 concentrations (DA24) and the daily maximum of the 8-h averaged O3 concentrations (DMA8) were calculated. If fewer than 75% of the hourly averaged PM2.5 concentrations within a 24-h period are available, the daily averaged PM2.5 concentration of that period is treated as a missing value. Likewise, the DMA8 of O3 is treated as missing when fewer than 75% of the 8-h moving averages of O3 concentrations within the 24-h period are available. The percentages of valid DA24 and DMA8 concentrations from Jan. 1, 2016 to Dec. 31, 2020 for all monitoring stations are presented in Table 4.9. The multilayer perceptron is used to impute the missing concentrations based on the concentrations of other correlated stations. After the MLP imputation step finishes, a cubic spline is further applied to impute the remaining missing concentrations for which the inputs of the MLP imputation model are also missing. As there is no ozone measurement at the roadside station of Macau (PO), the associated column is filled with N/A in the table. To construct the statistical air quality models for PM2.5 and O3, common input variables used in the literature or recommended in the Guidelines for Developing an Air Quality (Ozone and PM2.5) Forecasting Program by US EPA are used as the input variables of the prediction models. Table 4.10 presents the symbols of the output variable and the predictor variables of the forecasting models. In the O3 forecasting model, the output variable is the 24-h ahead forecast of the DMA8 ozone concentration from 13:00 of the kth day to 12:00 of the (k + 1)th day.
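The 75% completeness rule for DA24 and DMA8 described above can be sketched in plain Python. Several details are assumptions not stated in the text: missing hours are encoded as None, the 8-h windows are taken entirely within the 24-h period (17 windows), and an individual 8-h average is itself counted as valid only when at least 75% of its hours are present.

```python
def daily_average(hourly, min_fraction=0.75):
    """DA24: mean of the valid hourly values, or None if fewer than 75% are valid."""
    valid = [v for v in hourly if v is not None]
    if len(valid) < min_fraction * len(hourly):
        return None
    return sum(valid) / len(valid)


def moving_averages_8h(hourly, min_fraction=0.75):
    """8-h moving averages within the record; sparse windows yield None (assumed rule)."""
    return [daily_average(hourly[start:start + 8], min_fraction)
            for start in range(len(hourly) - 7)]


def dma8(hourly, min_fraction=0.75):
    """DMA8: daily maximum 8-h average, or None if fewer than 75% of windows are valid."""
    windows = moving_averages_8h(hourly, min_fraction)
    valid = [w for w in windows if w is not None]
    if len(valid) < min_fraction * len(windows):
        return None
    return max(valid)


# Hypothetical 24-h record with six missing hours: 18 of 24 valid, so DA24 is kept.
pm25_hourly = [20.0] * 18 + [None] * 6
print(daily_average(pm25_hourly))
```

Days that return None are the ones later filled by the MLP imputation and, where that is not possible, by the cubic spline.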
As for the input variables, the past O3 concentrations of the station adopted for model development and other monitoring stations, [O3]d,k, [O3]TG,h,k, [O3]EN,h,k, [O3]TC,h,k, and [O3]CD,h,k, are used to indicate the initial condition

Table 4.9 Percentage of valid concentrations before and after imputation.

                                                          PO     EN     TC     TG     CD     KH
PM2.5
  Percentage of valid data before imputation              98.3   96.9   84.5   95.5   97.3   59.8
  Percentage after imputation with MLP                    99.6   99.3   98.7   99.0   98.4   98.0
  Percentage after imputation with MLP and cubic spline   100    100    100    100    100    100
O3
  Percentage of valid data before imputation              N/A    98.3   84.8   97.3   92.8   65.0
  Percentage after imputation with MLP                    N/A    99.2   97.1   98.5   95.4   96.2
  Percentage after imputation with MLP and cubic spline   N/A    100    100    100    100    100


Air quality monitoring and advanced bayesian modeling

Table 4.10 Independent/dependent variables used in the PM2.5 and O3 prediction models.


of the O3 concentrations of the study area. The symbol [NO2]h,k denotes the NO2 concentration at the station adopted for model development, and it represents the initial condition of the ozone precursors. As for the meteorological input variables, the symbols Tmax,k+1 and T850,k+1 denote the maximum hourly surface temperature and the daily averaged temperature at 850 mb, respectively. These two variables are related to the vertical stability of the atmosphere. The solar radiation SRk+1 reflects the intensity of photochemistry. The total precipitation Precipk+1 is also a surrogate of photochemistry and an indicator of wet deposition of NO2. The relative humidity RHk+1 is a surrogate of cloud cover, and high relative humidity also favors the conversion of NO2 to nitrate. The geopotential height H500,k+1 at 500 mb represents the influence of the synoptic-scale weather pattern on the local air quality. The symbol WSk+1 denotes the surface wind speed. The symbols Uk+1 and Vk+1 denote the east-west component and the north-south component of the surface wind direction, respectively. These variables are used to reflect the local dilution of air pollutants. Finally, the symbols WS850,k+1, U850,k+1, and V850,k+1 represent the aloft wind speed and the vector components of the aloft wind direction at 850 mb, respectively. These variables can reflect the sources of the upwind areas (e.g., land/sea), which make different regional contributions of O3 and its precursors to the study area. As the air quality of Macau is generally dependent on the meteorological conditions of the same date, forecasted meteorological data provided by the Meteoblue history+ meteorological service are adopted in this study. The primary purpose of this case study is to compare the performance of different machine learning algorithms in air quality forecasting with the same input variables. Therefore the entire set of input variables in Table 4.10 is used in each model.
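The wind-direction inputs enter the models as vector components rather than as an angle. One common reason for this choice is that an angle is discontinuous at north (359° and 1° are almost the same direction). The sketch below is an illustration only: it assumes that U is the east-west component and V is the north-south component, that the direction is the meteorological direction the wind blows from, measured clockwise from north, and that unit components are used. None of these conventions is confirmed by the text, and the function name is hypothetical.

```python
import math


def wind_components(direction_deg):
    """Return (u, v), the east-west and north-south unit components.

    Assumption: direction_deg is the direction the wind blows FROM,
    in degrees clockwise from north (meteorological convention).
    """
    theta = math.radians(direction_deg)
    u = -math.sin(theta)  # east-west component, positive toward the east
    v = -math.cos(theta)  # north-south component, positive toward the north
    return u, v


# Two nearly identical directions on either side of north stay close together
# in component space, although their angles differ by 358 degrees.
print(wind_components(359.0))
print(wind_components(1.0))
```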
In the next chapter, we will attempt to prune unnecessary input variables by using Bayesian model class selection. Based on the input variables in Table 4.10 and the available air quality/meteorological dataset (2016–2020), the data of the first three years are used for training the statistical models (MLR, RR, LASSO, RF, MLP, SVR), and the data of the last two years are reserved for model validation. We start by presenting the results of the MLR models. Table 4.11 presents the estimated coefficients of the MLR models at each station for PM2.5 and tropospheric O3. The estimated coefficients of each input variable generally have consistent signs (+ or –) across all the stations. Only a few variables have opposite signs across the stations, and those are highlighted in the table. Next, the estimated coefficients in Table 4.11 are compared to those of the ridge regression and the LASSO regression. Figs. 4.21 and 4.22 show the ridge traces of the PM2.5 and O3 prediction models at the TG station and the variation of the RMSE with different choices of the hyperparameter γ during cross-validation with a holdout ratio of 0.3. The optimal value of γ is the one corresponding to the minimum RMSE during cross-validation. Figs. 4.23 and 4.24 show the corresponding plots for the LASSO regression models. The optimal value of γ is again selected by cross-validation with the same holdout ratio. Although some of the coefficients shrink


Table 4.11 Estimated model coefficients of the MLR models for PM2.5 and O3.

PM2.5
Variables               TG        EN        PO        TC        CD        KH
Tmax,k+1           -0.0168   -0.6353   -0.6271   -0.5483   -0.3132   -0.2018
T850,k+1           -0.3733   -0.2845   -0.3261    0.0253   -0.2494   -0.3133
SRk+1               0.0044    0.0096    0.0095    0.0089    0.0037    0.0035
Precipk+1           0.0543    0.0641    0.0831    0.0431    0.0263    0.0334
RHk+1              -0.1621   -0.1603   -0.1734   -0.2462   -0.1718   -0.1820
H500,k+1            0.0055    0.0083    0.0090    0.0090    0.0072    0.0069
WSk+1              -0.3663   -0.6003   -0.5295   -0.5359   -0.3758   -0.3226
Vk+1                3.7448    2.8886    2.8302    3.3488    5.1420    3.8923
Uk+1                1.7247    1.5369    0.7740    2.9273    2.3595    2.7768
WS850,k+1          -0.0463    0.0528    0.0013    0.0196   -0.0063   -0.0237
V850,k+1            0.6112    1.5592    2.0214    2.2186    1.1558    1.3310
U850,k+1           -1.1681   -1.9526   -1.4492   -1.8944   -2.0550   -1.6998
[PM2.5]TG,h,k       0.0662   -0.0052    0.0211   -0.0480    0.0069    0.0054
[PM2.5]EN,h,k       0.0004    0.2222    0.0791    0.0689    0.0127   -0.0010
[PM2.5]PO,h,k       0.1530    0.1758    0.2869    0.1095    0.1623    0.1521
[PM2.5]TC,h,k       0.1093    0.0495    0.0542    0.1908    0.1416    0.1262
[PM2.5]d,k          0.2735    0.2122    0.1817    0.2421    0.2906    0.3037

O3
Variables               TG        EN        TC        CD        KH
Tmax,k+1            0.6315    0.6910    0.6439    0.5989    0.4854
T850,k+1           -0.5318   -0.1542   -0.6543   -0.1661   -0.3489
SRk+1               0.0049    0.0026    0.0093    0.0050    0.0163
Precipk+1           0.0902    0.0892    0.1100    0.0725    0.0722
RHk+1               0.0634    0.0193    0.1691    0.0693    0.0854
H500,k+1           -0.0012   -0.0018   -0.0023   -0.0024   -0.0015
WSk+1              -0.0654    0.0307   -0.0555    0.0489   -0.0107
Vk+1                1.2964    5.8075    2.8696    1.4332    2.2727
Uk+1                2.8301    1.9484    3.8749    5.1237    4.4332
WS850,k+1          -0.0458   -0.0615   -0.0360   -0.0847   -0.0495
V850,k+1            6.8117    4.8897    5.2638    5.7227    5.4615
U850,k+1            0.1202   -0.8002   -1.9486   -0.9070   -1.2773
[O3]TG,h,k          0.6652    0.0091   -0.0639    0.1996    0.2794
[O3]EN,h,k          0.0916    0.6365    0.1476    0.1335    0.1331
[O3]TC,h,k         -0.0938    0.1436    0.6707    0.0113   -0.0468
[O3]CD,h,k          0.2052    0.0369    0.0444    0.5192    0.4024
[NO2]h,k            0.4137    0.0735    0.1336    0.2738    0.2862
[O3]d,k             0.0974    0.1295    0.1193    0.1204    0.1212


Fig. 4.21 Plots of ridge trace and variation of RMSE with γ for PM2.5 at the TG station.

Fig. 4.22 Plots of ridge trace and variation of RMSE with γ for O3 at the TG station.


Fig. 4.23 Plots of LASSO trace and variation of RMSE with γ for PM2.5 at the TG station.

Fig. 4.24 Plots of LASSO trace and variation of RMSE with γ for O3 at the TG station.


Table 4.12 Optimal estimates of hyperparameters for the ridge regression models and the LASSO regression models.

PM2.5
Stations        TG        EN        PO        TC        CD        KH
Ridge γ         11        21        16        17        18        8
LASSO γ         0.0121    0.0101    0.0191    0.0101    1e-4      1e-4

O3
Stations        TG        EN        TC        CD        KH
Ridge γ         49        41        1e-4      3         34
LASSO γ         0.1951    0.1431    0.0341    0.0801    0.0251
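The holdout selection of γ used for the ridge and LASSO models can be illustrated with a minimal sketch. It assumes the LASSO objective (1/2n)·Σ(residual)² + γ·Σ|βj|, which may be scaled differently from the definition used earlier in this chapter, and it fits the model by cyclic coordinate descent on synthetic data. The function names, the synthetic data, and the γ grid are hypothetical; only the holdout ratio of 0.3 comes from the text.

```python
import math
import random


def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; values within [-gamma, gamma] become 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_cd(X, y, gamma, n_sweeps=100):
    """Cyclic coordinate descent for (1/2n)*||y - X b||^2 + gamma*||b||_1.

    Assumption: no intercept, and predictors on comparable scales.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            rho = 0.0   # correlation of predictor j with the partial residual
            norm = 0.0  # sum of squares of predictor j
            for row, target in zip(X, y):
                partial = sum(row[k] * beta[k] for k in range(p) if k != j)
                rho += row[j] * (target - partial)
                norm += row[j] ** 2
            beta[j] = soft_threshold(rho / n, gamma) / (norm / n)
    return beta


def rmse(X, y, beta):
    errors = [sum(x * b for x, b in zip(row, beta)) - t for row, t in zip(X, y)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def select_gamma(X, y, gammas, holdout=0.3):
    """Fit on the first 70% of the samples and score each gamma on the last 30%."""
    n_val = int(round(holdout * len(X)))
    X_fit, y_fit = X[:-n_val], y[:-n_val]
    X_val, y_val = X[-n_val:], y[-n_val:]
    scores = {g: rmse(X_val, y_val, lasso_cd(X_fit, y_fit, g)) for g in gammas}
    return min(scores, key=scores.get), scores


# Hypothetical synthetic data: the third predictor has no effect on y.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(100)]
y = [2.0 * r[0] - 1.0 * r[1] + rng.gauss(0, 0.1) for r in X]

best_gamma, scores = select_gamma(X, y, [0.0, 0.01, 0.1, 1.0])
print(best_gamma, lasso_cd(X, y, best_gamma))
```

With a sufficiently large γ the soft-thresholding step sets every coefficient exactly to zero, which is the mechanism that lets LASSO flag an input as redundant.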

rapidly in the ridge traces or the LASSO traces, the signs of those coefficients remain unchanged at the optimal γ. Therefore the coefficients of the ridge regression and the LASSO regression are generally consistent with those of the MLR models for PM2.5 and O3 at the TG station. Table 4.12 summarizes the optimal hyperparameters used in the ridge regression (RR) models and the LASSO regression models. Given the optimal hyperparameters in Table 4.12, the estimated coefficients of the RR models for PM2.5 and O3 at all monitoring stations are presented in Table 4.13. The coefficients whose signs are flipped compared to those of the MLR models are highlighted in blue. There are two highlighted variables in the table: T850 in the PM2.5 prediction model at the TC station and RH in the O3 prediction model at the EN station. Table 4.14 presents the estimated coefficients of the LASSO regression models. A highlighted cell refers to a variable with a zero coefficient, and this variable can be treated as redundant in the model. There are two highlighted cells, corresponding to the same variable U850 at the EN and CD stations, respectively. The other coefficients of the ridge regression models and the LASSO regression models presented in Tables 4.13 and 4.14 are generally consistent with those of the MLR models in Table 4.11. As for the RF models, their performance is optimized by choosing an appropriate number of trees and maximum number of input variables used at each node. Each RF model is run with different numbers of selected inputs m ∈ {1, …, 17} during random input selection and with different numbers of trees NT ∈ {10, 20, …, 200}. Figs. 4.25 and 4.26 show the root-mean-square errors for predicting the OOB samples of PM2.5 and O3 in the training set of the TG station based on different combinations of m and NT.
It is noted that the RMSEs in both figures start to increase rapidly when the value of m is less than 5. This is consistent with the recommendation of the rule of thumb, which gives $m_{\mathrm{PM2.5}} = \sqrt{17} \approx 4.12$ for the RF model of PM2.5 and $m_{\mathrm{O3}} = \sqrt{18} \approx 4.24$ for the RF model of O3. The optimal combination of m and NT (shown by the red dot in each figure) is the point with the minimum RMSE of the OOB samples. Based on the optimal combinations, the importance of the input variables in these two RF models is further examined by random permutation. Figs. 4.27 and 4.28 show the permutation importance of the input variables in the RF models for PM2.5 and O3 at the TG station, respectively. For the RF model of PM2.5,


Table 4.13 Estimated model coefficients of the RR models for PM2.5 and O3.

PM2.5
Variables               TG        EN        PO        TC        CD        KH
Tmax,k+1           -0.0327   -0.6478   -0.6323   -0.5559   -0.3406   -0.2116
T850,k+1           -0.3764   -0.3041   -0.3391   -0.0108   -0.2612   -0.3232
SRk+1               0.0037    0.0087    0.0088    0.0081    0.0021    0.0030
Precipk+1           0.0537    0.0634    0.0828    0.0423    0.0252    0.0329
RHk+1              -0.1645   -0.1619   -0.1769   -0.2441   -0.1763   -0.1820
H500,k+1            0.0056    0.0085    0.0091    0.0092    0.0074    0.0069
WSk+1              -0.3629   -0.5971   -0.5277   -0.5342   -0.3691   -0.3207
Vk+1                3.4417    2.5271    2.5750    3.0415    4.4996    3.6760
Uk+1                1.5412    1.1588    0.5874    2.3943    1.9313    2.5332
WS850,k+1          -0.0465    0.0526    0.0009    0.0205   -0.0070   -0.0235
V850,k+1            0.6729    1.4935    1.9482    2.0856    1.2394    1.3282
U850,k+1           -0.9969   -1.6147   -1.2519   -1.5342   -1.6575   -1.5207
[PM2.5]TG,h,k       0.0666   -0.0047    0.0215   -0.0473    0.0078    0.0058
[PM2.5]EN,h,k       0.0010    0.2233    0.0800    0.0705    0.0142   -0.0003
[PM2.5]PO,h,k       0.1528    0.1753    0.2868    0.1083    0.1610    0.1514
[PM2.5]TC,h,k       0.1091    0.0495    0.0543    0.1906    0.1408    0.1261
[PM2.5]d,k          0.2753    0.2145    0.1828    0.2454    0.2961    0.3060

O3
Variables               TG        EN        TC        CD        KH
Tmax,k+1            0.6747    0.6600    0.6439    0.6001    0.4805
T850,k+1           -0.6595   -0.2292   -0.6543   -0.1815   -0.4587
SRk+1               0.0055    0.0003    0.0093    0.0051    0.0164
Precipk+1           0.0916    0.0885    0.1100    0.0722    0.0702
RHk+1               0.0608   -0.0015    0.1691    0.0715    0.0941
H500,k+1           -0.0011   -0.0012   -0.0023   -0.0024   -0.0013
WSk+1              -0.0882    0.0264   -0.0555    0.0464   -0.0276
Vk+1                1.6449    4.7690    2.8696    1.4756    2.2567
Uk+1                1.5927    1.2313    3.8749    4.9194    3.0107
WS850,k+1          -0.0370   -0.0598   -0.0360   -0.0834   -0.0409
V850,k+1            5.1542    4.3040    5.2638    5.5852    4.4462
U850,k+1            0.4779   -0.2667   -1.9486   -0.8286   -0.6466
[O3]TG,h,k          0.6663    0.0122   -0.0639    0.1995    0.2769
[O3]EN,h,k          0.0958    0.6510    0.1476    0.1333    0.1360
[O3]TC,h,k         -0.0897    0.1419    0.6707    0.0118   -0.0443
[O3]CD,h,k          0.2069    0.0307    0.0444    0.5196    0.4077
[NO2]h,k            0.4239    0.0784    0.1336    0.2744    0.2927
[O3]d,k             0.1022    0.1321    0.1193    0.1210    0.1250


Table 4.14 Estimated model coefficients of the LASSO models for PM2.5 and O3.

PM2.5
Variables               TG        EN        PO        TC        CD        KH
Tmax,k+1           -0.0176   -0.6368   -0.6298   -0.5493   -0.3128   -0.2013
T850,k+1           -0.4641   -0.3616   -0.3793   -0.1101   -0.3183   -0.4489
SRk+1               0.0048    0.0100    0.0095    0.0097    0.0042    0.0045
Precipk+1           0.0625    0.0713    0.0865    0.0565    0.0337    0.0477
RHk+1              -0.1487   -0.1488   -0.1656   -0.2262   -0.1615   -0.1623
H500,k+1            0.0193    0.0203    0.0155    0.0312    0.0191    0.0300
WSk+1              -0.3640   -0.5982   -0.5274   -0.5328   -0.3744   -0.3200
Vk+1                3.6540    2.8068    2.7119    3.2507    5.1208    3.8593
Uk+1                1.4553    1.3087    0.4358    2.6465    2.2914    2.6575
WS850,k+1          -0.0402    0.0579    0.0040    0.0289   -0.0012   -0.0139
V850,k+1            0.6397    1.5889    1.9628    2.3194    1.2387    1.4917
U850,k+1           -1.1826   -1.9704   -1.3072   -2.0230   -2.1837   -1.9495
[PM2.5]TG,h,k       0.0673   -0.0044    0.0214   -0.0469    0.0075    0.0069
[PM2.5]EN,h,k       0.0010    0.2229    0.0803    0.0695    0.0125   -0.0010
[PM2.5]PO,h,k       0.1559    0.1779    0.2875    0.1140    0.1647    0.1574
[PM2.5]TC,h,k       0.1079    0.0480    0.0538    0.1891    0.1404    0.1240
[PM2.5]d,k          0.2711    0.2112    0.1822    0.2379    0.2885    0.2965

O3
Variables               TG        EN        TC        CD        KH
Tmax,k+1            0.5430    0.5470    0.6006    0.4919    0.4558
T850,k+1           -0.5414   -0.2181   -0.6515   -0.1481   -0.3040
SRk+1               0.0044    0.0027    0.0092    0.0046    0.0158
Precipk+1           0.0860    0.0949    0.1090    0.0701    0.0662
RHk+1               0.0351    0.0149    0.1692    0.0641    0.0779
H500,k+1           -0.0003    0.0121   -0.0019   -0.0027   -0.0099
WSk+1              -0.0662    0.0024   -0.0598    0.0311   -0.0165
Vk+1                0.4929    5.0969    2.6424    0.9427    2.1403
Uk+1                1.2214    0.0899    3.3256    3.8255    4.0912
WS850,k+1          -0.0363   -0.0445   -0.0340   -0.0786   -0.0513
V850,k+1            5.8763    4.2799    5.0848    5.2802    5.2549
U850,k+1            0.3009         0   -1.5540         0   -0.9085
[O3]TG,h,k          0.6662    0.0120   -0.0627    0.1998    0.2790
[O3]EN,h,k          0.0993    0.6395    0.1478    0.1344    0.1360
[O3]TC,h,k         -0.0903    0.1462    0.6710    0.0145   -0.0478
[O3]CD,h,k          0.2049    0.0357    0.0439    0.5197    0.4040
[NO2]h,k            0.4382    0.0806    0.1364    0.2805    0.2858
[O3]d,k             0.1041    0.1340    0.1204    0.1234    0.1214


Fig. 4.25 RMSE of OOB samples for different combinations of m and NT in the PM2.5 prediction model at the TG station.

Fig. 4.26 RMSE of OOB samples for different combinations of m and NT in the O3 prediction model at the TG station.

the bar graph suggests that the past PM2.5 concentrations are the most important predictor variables. Apart from these, important predictor variables also include the aloft temperature (T850), the north-south components of the surface and aloft wind directions (V and V850), the relative humidity (RH), and the precipitation (Precip). For the RF model of O3, the past concentrations of O3 and NO2 are the most important predictor variables. Other important variables include the aloft temperature (T850), the north-south components of surface


Fig. 4.27 Permutation importance of input variables in the RF model for PM2.5 at TG station.

Fig. 4.28 Permutation importance of input variables in the RF model for O3 at TG station.


and aloft wind directions (V and V850), the surface wind speed (WS), and the solar radiation (SR). Table 4.15 presents the optimal estimates of m and NT for the RF models of PM2.5 and O3 at all monitoring stations.

The MLP models of PM2.5 and O3 are built by using the Neural Network Toolbox of Matlab. The hyperbolic tangent transfer function is used in the hidden layer. In each model, 70% of the data from 2016 to 2018 are used for model training and 30% of the data are used for cross-validation. The basic idea of cross-validation in the MLP training is to prevent overfitting by early stopping. At each iteration, the mean square error of the MLP in fitting the 30% cross-validation dataset is checked based on the weights and biases obtained from the 70% training data. If the error increases compared to that of the previous iteration, the training is stopped. In order to determine the required number of hidden neurons in each model, different numbers of hidden neurons ranging from 1 to 20 are tried. The number of hidden neurons which corresponds to the minimum RMSE of fitting the training data is chosen. Table 4.16 presents the number of hidden neurons used in each MLP model. The optimal number in general is between 10 and 20 for the given input variables.

The SVR models of PM2.5 and O3 are built by using the Statistics and Machine Learning Toolbox of Matlab. The RBF kernel is used in each SVR model. The radius ε of the error-insensitive tube and the regularization constant C in each model are selected by minimizing the error of the 10-fold cross-validation. This minimization is performed by using the Genetic Algorithm in the Global Optimization Toolbox of Matlab. The Genetic Algorithm performs optimization based on repeated crossover, mutation, and selection of chromosomes. Table 4.17 presents the optimal estimates of ε and C in each SVR model.

Table 4.15 Optimal estimates of m and NT for the RF models of PM2.5 and O3.

PM2.5
Stations    TG     EN     PO     TC     CD     KH
m           13     15     13     12     10     12
NT          190    100    180    70     180    110

O3
Stations    TG     EN     TC     CD     KH
m           14     17     15     17     14
NT          170    130    200    140    90
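The permutation importance used to rank the RF inputs (Figs. 4.27 and 4.28) can be sketched as follows. This is an illustration under stated assumptions and not the procedure of the Matlab toolbox: it works with any fitted predictor passed in as a function, it scores the increase in RMSE on a given data set rather than on the OOB samples of each tree, and the stand-in model and data are hypothetical.

```python
import math
import random


def rmse(predict, X, y):
    return math.sqrt(sum((predict(r) - t) ** 2 for r, t in zip(X, y)) / len(y))


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in RMSE after shuffling each input column in turn."""
    rng = random.Random(seed)
    baseline = rmse(predict, X, y)
    importance = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between input j and the output
            X_perm = [row[:j] + [value] + row[j + 1:]
                      for row, value in zip(X, column)]
            increases.append(rmse(predict, X_perm, y) - baseline)
        importance.append(sum(increases) / n_repeats)
    return importance


# Hypothetical stand-in for a fitted model: it ignores the third input.
def predict(row):
    return 3.0 * row[0] + 0.5 * row[1]


rng = random.Random(1)
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]
y = [predict(row) for row in X]

importance = permutation_importance(predict, X, y)
print(importance)
```

An input the model does not use receives an importance of zero, while shuffling an influential input degrades the predictions and yields a large positive value.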

Table 4.16 Number of hidden neurons used in the MLP models of PM2.5 and O3.

PM2.5
Stations    TG     EN     PO     TC     CD     KH
h           16     10     15     8      15     10

O3
Stations    TG     EN     TC     CD     KH
h           17     10     17     20     19
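The early-stopping rule described for the MLP training can be sketched in a few lines. To keep the example self-contained, a one-input linear model trained by gradient descent stands in for the MLP. The function names, learning rate, and data are hypothetical, and the stopping rule is the simple one stated in the text (stop as soon as the validation error increases), which may differ from the toolbox defaults.

```python
def mse(w, b, data):
    """Mean square error of the model w*x + b on (x, target) pairs."""
    return sum((w * x + b - t) ** 2 for x, t in data) / len(data)


def train_with_early_stopping(train, val, lr=0.05, max_iter=1000):
    """Gradient descent on the training MSE, stopped by the validation MSE."""
    w, b = 0.0, 0.0
    best = (w, b)
    prev_val = mse(w, b, val)
    for iteration in range(1, max_iter + 1):
        # One gradient-descent step using the training data only.
        grad_w = sum(2 * (w * x + b - t) * x for x, t in train) / len(train)
        grad_b = sum(2 * (w * x + b - t) for x, t in train) / len(train)
        w, b = w - lr * grad_w, b - lr * grad_b
        val_error = mse(w, b, val)
        if val_error > prev_val:
            # Validation error went up: keep the previous weights and stop.
            return best, iteration
        best, prev_val = (w, b), val_error
    return best, max_iter


# Hypothetical data in which the training and validation targets disagree,
# so fitting the training set ever more closely eventually hurts validation.
train_set = [(1.0, 2.0)]
val_set = [(1.0, 1.0)]
(best_w, best_b), stopped_at = train_with_early_stopping(train_set, val_set)
print(best_w, best_b, stopped_at)
```

The returned weights are those from the last iteration before the validation error rose, so the validation error of the returned model never exceeds that of the starting point.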


Table 4.17 Optimal estimates of ε and C for the SVR models of PM2.5 and O3.

Finally, the performance of the prediction models for PM2.5 and O3 during training (in years 2016–18) and validation (in years 2019 and 2020) is summarized in the boxplots of Figs. 4.29 and 4.30. The performance of each model is measured by using the root-mean-square error (RMSE), the coefficient of determination (R2), the index of agreement (IA), and the mean fractional bias (MFB). The definitions of these indicators can be found in the previous chapters. In the boxplot, each box is a summary of the performance achieved by the model for a given performance measure among all monitoring stations. The red line within the box represents the median (2nd quartile) of the data. The upper end and the lower end of the box represent the 75th percentile (3rd quartile) and the 25th percentile (1st quartile) of the data. The difference between the 1st and 3rd quartiles is called the interquartile range (IQR). The vertical lines outside the upper and lower ends of the box are called the whiskers. The tiny bars attached to the ends of the whiskers are the maximum and the minimum values of the data. If the maximum or minimum values are too far away from the interquartile region, they are treated as outliers (+) in the boxplot. Therefore the box allows us to readily understand the central location of the distribution and how the data are spread out. It is a convenient tool to compare the distribution of a performance measure among different prediction models.

For the prediction models of PM2.5 during training, the RF model performs consistently better than the other models. The SVR model outperforms the MLP model, and the MLP model outperforms the remaining models. Since the MLR, RR, and LASSO models have fewer parameters than the other models, it is reasonable to see these models underperform during training. However, these techniques are more robust to modeling error. Therefore their performance could be more stable in the long run.
For the prediction models of PM2.5 during validation, it is noted that the MLR, RR, and LASSO models perform consistently better than the other models, which agrees with the expectation stated above. For the prediction models of O3 during training, the RF model also performs consistently better than the other models. The SVR model consistently outperforms the MLP model, and the MLP model outperforms the remaining models except for the MFB. For the prediction models of O3 during validation, it is noted that the performances of the MLR, RR, LASSO, and RF models are close to one another. The MLP and the SVR models underperform compared to their counterparts. However, the difference


Fig. 4.29 Performance of the PM2.5 prediction models in Macau.

Fig. 4.30 Performance of the O3 prediction models in Macau.


in performance is less obvious than in the case of PM2.5, because the performance of the MLR, RR, LASSO, and RF models for O3 also has large variability, as indicated by their large interquartile ranges in RMSE and MFB.
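The four performance measures used in this comparison can be computed as in the following sketch. The book defines them in earlier chapters, which are not reproduced here, so the formulas below are the commonly used forms and should be read as assumptions: R2 is taken as the squared Pearson correlation, IA as Willmott's index of agreement, and MFB as the mean of (P − O) divided by the average of P and O. The function names and the sample values are hypothetical.

```python
import math


def rmse(obs, pred):
    """Root-mean-square error."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def r_squared(obs, pred):
    """Assumption: R2 taken as the squared Pearson correlation."""
    n = len(obs)
    mean_o, mean_p = sum(obs) / n, sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov * cov / (var_o * var_p)


def index_of_agreement(obs, pred):
    """Willmott's index of agreement (1 means perfect agreement)."""
    mean_o = sum(obs) / len(obs)
    num = sum((p - o) ** 2 for o, p in zip(obs, pred))
    den = sum((abs(p - mean_o) + abs(o - mean_o)) ** 2 for o, p in zip(obs, pred))
    return 1.0 - num / den


def mean_fractional_bias(obs, pred):
    """Positive values indicate overprediction, negative underprediction."""
    return 2.0 / len(obs) * sum((p - o) / (p + o) for o, p in zip(obs, pred))


observed = [10.0, 20.0, 30.0, 40.0]   # hypothetical concentrations
predicted = [12.0, 22.0, 32.0, 42.0]  # hypothetical forecasts, biased high by 2

print(rmse(observed, predicted))
print(r_squared(observed, predicted))
print(index_of_agreement(observed, predicted))
print(mean_fractional_bias(observed, predicted))
```

In this example the forecasts track the observations perfectly apart from a constant offset, so the squared correlation is essentially 1 while the RMSE and the MFB still expose the bias. This is one reason for reporting several measures side by side.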

References Andreae, M.O., Rosenfeld, D., Artaxo, P., Costa, A.A., Frank, G.P., Longo, K.M., Silva-Dias, M.A.F., 2004. Smoking rain clouds over the Amazon. Science 303, 1337–1342. https://doi.org/10.1126/ science.1092779. Agirre-Basurko, E., Ibarra-Berastegi, G., Madariaga, I., 2006. Regression and multilayer perceptron-based models to forecast hourly O3 and NO2 levels in the Bilbao area. Environ. Model. Software 21, 430–446. https://doi.org/10.1016/j.envsoft.2004.07.008. Althuwaynee, O.F., Balogun, A., Madhoun, W.A., 2020. Air pollution hazard assessment using decision tree algorithms and bivariate probability cluster polar function: evaluating inter-correlation clusters of PM10 and other air pollutants. GIsci. Remote Sens. 57, 207–226. https://doi.org/ 10.1080/15481603.2020.1712064. Abdullah, S., Ismail, M., Ahmed, A.N., Abdullah, A.M., 2019. Forecasting particulate matter concentration using linear and non-linear approaches for air quality decision support. Atmos. 10, 667. https://doi.org/ 10.3390/atmos10110667. Arroyo, A´., Herrero, A´., Tricio, V., Corchado, E., Woz´niak, M., 2018. Neural models for imputation of missing ozone data in air-quality datasets. Complexity 2018, 7238015. https://doi.org/10.1155/2018/ 7238015. Balogun, A., Tella, A., 2022. Modelling and investigating the impacts of climatic variables on ozone concentration in Malaysia using correlation analysis with random forest, decision tree algorithm, linear regression, and support vector regression. Chemosphere 299, 134250. https://doi.org/10.1016/j. chemosphere.2022.134250. Beig, G., Sahu, S.K., Anand, V., Bano, S., Maji, S., Rathod, A., Korhale, N., Sobhana, S.B., Parkhi, N., Mangaraj, P., Srinivas, R., Peshin, S.K., Singh, S., Shinde, R., Trimbake, K., 2021. India’s maiden air quality forecasting framework for megacities of divergent environments. SAFAR-project. Environ. Model. Softw. 145, 105204. https://doi.org/10.1016/j.envsoft.2021.105204. 
Bera, B., Bhattacharjee, S., Sengupta, N., Saha, S., 2021. PM2.5 concentration prediction during COVID-19 lockdown over Kolkata metropolitan city, India using MLR and ANN models. Environ. Challenges 4, 100155. https://doi.org/10.1016/j.envc.2021.100155. Bishop, C., 2006. Pattern Recognition and Machine Learning. Springer. Boyd, S., Vandenberghe, L., 2004. Convex Optimization. Cambridge University Press. Breiman, L., 1996a. Bagging predictors. Mach. Learn. 24, 123–140. https://doi.org/10.1007/BF00058655. Breiman, L., 1996b. Heuristics of instability and stabilization in model selection. Ann. Stat. 24, 2350–2383. https://doi.org/10.1214/aos/1032181158. Breimen, L., 2001. Random forests. Mach. Learn. 45, 5–32. https://doi.org/10.1023/A:1010933404324. Breimen, L., Friedman, J.H., Olshen, R.A., Stone, C.J., 1984. Classification and Regression Trees. Chapman and Hall CRC. Chattopadhyay, G., Midya, S.K., Chattopadhyay, S., 2019. MLP based predictive model for surface ozone concentration over an urban area in the Gangetic West Bengal during pre-monsoon season. J. Atmos. Sol. Terr. Phys. 184, 57–62. https://doi.org/10.1016/j.jastp.2019.01.008. Chelani, A.B., 2019. Estimating PM2.5 concentration from satellite derived aerosol optical depth and meteorological variables using a combination model. Atmos. Pollut. Res. 10, 847–857. https://doi.org/ 10.1016/j.apr.2018.12.013. Cobourn, W.G., Hubbard, M.C., 1999. An enhanced ozone forecasting model using air mass trajectory analysis. Atmos. Environ. 33, 4663–4674. https://doi.org/10.1016/S1352-2310(99)00240-X. Cortis, C., Vapnik, V., 1995. Support-vector networks. Mach. Learn. 20, 273–297. https://doi.org/ 10.1007/BF00994018.

Traditional statistical air quality forecasting methods

Demuzere, M., van Lipzig, N.P.M., 2010. A new method to estimate air-quality levels using a synoptic regression approach. Part I: Present-day O3 and PM10 analysis. Atmos. Environ. 44, 1341–1355. https://doi.org/10.1016/j.atmosenv.2009.06.029. Draper, N.R., Smith, H., 1998. Applied Regression Analysis. John Wiley & Sons. Drucker, H., Burges, J.C., Kaufman, L., Smola, A., Vapnik, V., 1996. Support vector regression machines. In: Advances in Neural Information Processing Systems 9, NIPS 1996. MIT Press, pp. 155–161. Elbisy, M.S., Ali, H.M., Abd-Elall, M.A., Alaboud, T.M., 2014. The use of feed-forward back propagation and cascade correlation for the neural network prediction of surface water quality parameters. Water Resour. 41, 709–718. https://doi.org/10.1134/S0097807814060153. Ekman, M., 2021. Learning Deep Learning: Theory and Practice of Neural Networks, Computer Vision, Natural Language Processing, and Transformers Using TensorFlow. Addison-Wesley Professional. Etchie, T.O., Etchie, A.Y., Jauro, A., Pinker, R.T., Swaminathan, N., 2021. Season, not lockdown, improved air quality using COVID-19 state of emergency in Nigeria. Sci. Total Environ. 768, 145187. https://doi.org/10.1016/j.scitotenv.2021.145187. Fabregat, A., Va´zquez, L., Vernet, A., 2021. Using machine learning to estimate the impact of ports and cruise ship traffic on urban air quality. Environ. Model. Software 139, 104995. https://doi.org/ 10.1016/j.envsoft.2021.104995. Fawagreh, K., Gaber, M.M., Elyan, E., 2014. Random forests: from early developments to recent advancements. Syst. Sci. Control Eng. 2, 602–609. https://doi.org/10.1080/21642583.2014.956265. Feng, X., Li, Q., Zhu, Y., Hou, J., Jin, L., Wang, J., 2015. Artificial neural networks forecasting of PM2.5 pollution using air mass trajectory based geographic model and wavelet transformation. Atmos. Environ. 107, 118–128. https://doi.org/10.1016/j.atmosenv.2015.02.030. 
Fernando, H.J.S., Mammarella, M.C., Grandoni, G., Fedele, P., Marco, R.D., Dimitrova, R., Hyde, P., 2012. Forecasting PM10 in metropolitan areas: efficacy of neural networks. Environ. Pollut. 163, 62–67. https://doi.org/10.1016/j.envpol.2011.12.018. Finlay, W.H., Darquenne, C., 2020. Particle size distributions. J. Aerosol Med. Pulm. Drug Deliv. 33, 178–180. https://doi.org/10.1089/jamp.2020.29028. Flake, G.W., Lawrence, S., 2002. Efficient SVM regression training with SMO. Mach. Learn. 46, 271–290. https://doi.org/10.1023/A:1012474916001. Friedman, J., Tibshirani, R., Hastie, T., 2010. Regularization paths for generalized linear models via coordinate descent. J. Stat. Softw. 33, 1–22. Gardner, M.W., Dorling, S.R., 1999. Neural network modelling and prediction of hourly NOx and NO2 concentrations in urban air in London. Atmos. Environ. 33, 709–719. https://doi.org/10.1016/S13522310(98)00230-1. Gass, K., Klein, M., Chang, H.H., Flanders, D.W., Strickland, M.J., 2014. Classification and regression tree for epidemiologic research: an air pollution example. Environ. Health 13, 17. http://www.ehjournal. net/content/13/1/17. Hagan, M.T., Menhaj, M., 1994. Training feed-forward networks with the Marquardt algorithm. IEEE Trans. Neural Netw. 5, 989–993. https://doi.org/10.1109/72.329697. Hagan, M.T., Demuth, H.B., Beale, M.H., 1996. Neural Network Design. PWS Publishing. Ha´jek, P., Olej, V., 2012. Ozone prediction on the basis of neural networks, support vector regression and methods with uncertainty. Eco. Inform. 12, 31–42. https://doi.org/10.1016/j.ecoinf.2012.09.001. Hand, D.J., Vinciotti, V., 2003. Local versus global models for classification problems: fitting models where it matters. Am. Stat. 57, 124–131. https://doi.org/10.1198/0003130031423. Hang, R., Liu, Q., Xia, G., Song, H., 2018. Correcting MODIS aerosol optical depth products using a ridge regression model. Int. J. Remote Sens. 39, 3275–3286. https://doi.org/ 10.1080/01431161.2018.1439597. 
Hastie, T., Tibshirani, R., Friedman, J., 2009. The Elements of Statistical Learning: Data Mining, Inference and Prediction, second ed. Springer Science & Business Media. Hertig, E., Schneider, A.E., Peters, A., von Scheidt, W., Kuch, B., Meisinger, C., 2019. Association of ground-level ozone, meteorological factors and weather types with daily myocardial infarction frequencies in Augsburg, southern Germany. Atmos. Environ. 217, 116975. https://doi.org/10.1016/j. atmosenv.2019.116975.

239

240

Air quality monitoring and advanced bayesian modeling

Hijmans, R.J., University of California, Berkeley Museum of Vertebrate Zoology, International Rice Research Institute, University of California, Davis, 2015. Global Administrative Areas Version 2.8. University of California, Berkeley. Museum of Vertebrate Zoology. Retrieved from https://earthworks. stanford.edu/catalog/stanford-jv457hb9421. Ho, T.K., 1998. The random subspace method for constructing decision forests. IEEE Trans. Pattern Anal. Mach. Intell. 20, 832–844. https://doi.org/10.1109/34.709601. Hoerl, R.W., 2020. Ridge regression: a historical context. Technometrics 62, 420–425. https://doi.org/ 10.1080/00401706.2020.1742207. Hoerl, A.E., Kennard, R.W., 1970. Ridge regression: applications to nonorthogonal problems. Technometrics 12, 69–82. https://doi.org/10.2307/1267352. Hoi, K.I., Yuen, K.V., Mok, K.M., 2013. Improvement of the multilayer perceptron for air quality modelling through an adaptive learning scheme. Comput. Geosci. 59, 148–155. https://doi.org/10.1016/j. cageo.2013.06.002. Hong, C., Mueller, N.D., Burney, J.A., Zhang, Y., Aghakouchak, A., Moore, F.C., Qin, Y., Tong, D., Davis, S.J., 2020. Impacts of ozone and climate change on yields of perennial crops in California. Nat. Food 1, 166–172. https://doi.org/10.1038/s43016-020-0043-8. Horie, Y., 1988. Ozone Episode Representativeness Study for the South Coast Air Basin. Report Prepared for the South Coast Air Quality Management District, El Monte. CA by Valley Research Corporation, El Monte, CA. Hornik, K., Stinchcombe, M., White, H., 1989. Multilayer feedforward networks are universal approximators. Neural Netw. 2, 359–366. https://doi.org/10.1016/0893-6080(89)90020-8. James, G., Witten, D., Hastie, T., Tibshirani, R., 2021. An Introduction to Statistical Learning: With Applications to R, second ed. Springer Science & Business Media. Junninen, H., Niska, H., Tuppurainen, K., Ruuskanen, J., Kolehmainen, M., 2004. Methods for imputation of missing values in air quality data sets. Atmos. Environ. 








CHAPTER 5

Advanced Bayesian air quality forecasting methods

Contents
5.1 Overview of technique limitations and advanced topics for improvement
  5.1.1 Choice of model complexity
  5.1.2 Necessity of model adaptiveness
5.2 Bayesian model class selection of linear regression model
  5.2.1 Overview
  5.2.2 Basics of Bayesian model class selection in linear regression model
  5.2.3 Modeling of Keeling curve
5.3 Kalman filter-based adaptive air quality model
  5.3.1 Overview
  5.3.2 Basics of Kalman filter-based adaptive air quality model
  5.3.3 Selection of perturbation matrix and measurement noise variance
  5.3.4 Revisiting example 5.2.3 (modeling of Keeling curve) with the adaptive linear model
5.4 Time-varying multilayer perceptron
  5.4.1 Overview
  5.4.2 Basics of time-varying multilayer perceptron
  5.4.3 Example: Prediction of Mackey–Glass time series by using the TVMLP model
5.5 Adaptive Bayesian model averaging of multiple time-varying regression models
  5.5.1 Overview
  5.5.2 Basics of dynamic Bayesian model averaging
  5.5.3 Modeling of measured PM2.5 concentration of the low-cost sensor
5.6 Case study
  5.6.1 Overview
  5.6.2 Air quality forecasting in Macau with the adaptive linear models
References

In Section 5.1, we introduce the drawbacks of the traditional methods and some advanced techniques for improving the statistical air quality forecasting methods described in Chapter 4. In an air quality forecasting model, the air pollutant concentration to be forecasted can be related to many potential explanatory variables, and model developers need to decide on an appropriate combination during model development. In Section 5.2, Bayesian model class selection is presented for selecting an efficient and robust input combination for the forecasting model. Apart from the selection of input variables, another challenge is model adaptiveness. Even though the input variables are systematically selected, the model parameters are obtained from a given set of training data and remain fixed in operational forecasting until the model is retrained with a new batch of data. To minimize the effort of retraining the model, Sections 5.3 and 5.4 describe adaptive air quality models based on the Kalman filter. Section 5.5 introduces how to combine the predictions from an ensemble of adaptive air quality models: the adaptive Bayesian model averaging technique continuously evaluates the probabilities of all adaptive models in a forecasting system, and these probabilities are then used to provide a weighted average forecast. A case study is provided in Section 5.6 to illustrate these techniques.

Air Quality Monitoring and Advanced Bayesian Modeling. https://doi.org/10.1016/B978-0-323-90266-3.00003-0
Copyright © 2023 Elsevier Inc. All rights reserved.

5.1 Overview of technique limitations and advanced topics for improvement

5.1.1 Choice of model complexity

Model complexity is an important issue in the development of statistical models. When the chosen model is too simple, it tends to underfit the training data and there will be a large fitting error. In contrast, a complex model can fit the training data very well, or even pass through all of the data points exactly, but such a model may overfit the data. Besides learning the general pattern of the data that arises from the underlying mechanism of the system, the complex model also adapts to unwanted details caused by measurement noise. Consequently, the complex model may produce inaccurate predictions for unseen input data. Overfitting and underfitting are best illustrated by the simple curve-fitting example shown in Fig. 5.1. The data (37 red points) in the figure are generated by the following function (black line) with measurement noise having a signal-to-noise ratio (SNR) of 15.71 dB:

$$ y = 1.5\sin\left(\frac{\pi x}{180}\right) + 0.5\sin\left(\frac{3\pi x}{180}\right), \quad x \in [-180^\circ, 180^\circ] \tag{5.1} $$

where the SNR is defined by:

$$ \mathrm{SNR} = 20\log_{10}\left(\frac{\mathrm{RMS}_{\mathrm{signal}}}{\mathrm{RMS}_{\mathrm{noise}}}\right) \tag{5.2} $$

The unit of the SNR is the decibel (dB). A positive SNR implies that the root-mean-square value of the signal is larger than that of the background noise, and vice versa. In Fig. 5.1, the training data were fitted with polynomials of different orders (magenta lines). The 5th-order polynomial underfits the curve and cannot capture the ripples on the crest and trough. The 10th-order polynomial essentially captures the overall pattern of the actual curve. As polynomials of higher orders are used, they achieve a better goodness of fit to the data points. However, this slight improvement leads to spurious responses of the fitted polynomials at locations that are not supported by training data.

Fig. 5.1 Plots of actual curve (black), training data (red dots), and polynomials of different orders (magenta) fitted to the training data.

A similar observation is made in Fig. 5.2, which shows the RMSE of the polynomials during training and validation. For training, the RMSE decreases with the order of the polynomial. For validation, the RMSE follows a bowl-shaped curve. As the model order increases from 5 to 10, the underfitting problem is alleviated by the higher-order terms in the model. When the order of the polynomial is increased further, the validation RMSE also increases. As the model becomes more complex, its validation performance can become very sensitive to the estimation error of the model parameters because of overfitting. Therefore caution should be exercised in the model selection process. If the optimal model were chosen only by minimizing the RMSE of the training data, the most complex model would be chosen. It is therefore necessary to penalize more complex model classes while searching for a model that gives sufficient goodness of fit to the data.


Fig. 5.2 RMSE of training and validation for polynomials of different orders.
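The curve-fitting experiment can be reproduced in outline with a few lines of Python. The sketch below is not the code behind Figs. 5.1 and 5.2. It assumes Gaussian noise, an arbitrary random seed, validation against the noise-free curve on a dense grid, and a Chebyshev basis (which spans the same polynomials as the ordinary power basis but is better conditioned at high orders). The polynomial orders in the loop are also illustrative choices.

```python
import numpy as np

def true_curve(x_deg):
    """Noise-free signal of Eq. (5.1); x is given in degrees."""
    return 1.5 * np.sin(np.pi * x_deg / 180) + 0.5 * np.sin(3 * np.pi * x_deg / 180)

def rms(values):
    return np.sqrt(np.mean(np.square(values)))

def snr_db(signal, noise):
    """Signal-to-noise ratio of Eq. (5.2) in decibels."""
    return 20 * np.log10(rms(signal) / rms(noise))

def rmse(pred, obs):
    return float(np.sqrt(np.mean((pred - obs) ** 2)))

rng = np.random.default_rng(0)            # assumed seed
x_train = np.linspace(-180, 180, 37)      # 37 training points, as in Fig. 5.1
signal = true_curve(x_train)
noise = rng.standard_normal(x_train.size)
# Rescale the noise so that the realized SNR equals 15.71 dB.
noise *= rms(signal) / rms(noise) / 10 ** (15.71 / 20)
y_train = signal + noise

x_val = np.linspace(-180, 180, 721)       # assumed validation grid
y_val = true_curve(x_val)

results = {}
for order in (5, 10, 15, 20, 25):
    # Least-squares polynomial fit of the given order.
    poly = np.polynomial.Chebyshev.fit(x_train, y_train, deg=order)
    results[order] = (rmse(poly(x_train), y_train), rmse(poly(x_val), y_val))
    print(f"order {order:2d}: training RMSE {results[order][0]:.3f}, "
          f"validation RMSE {results[order][1]:.3f}")
```

The training RMSE can only decrease as the order grows because the lower-order polynomials are special cases of the higher-order ones, whereas the validation RMSE depends on how the fitted curve behaves between and beyond the training points.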

This idea was first proposed by H. Jeffreys, who pointed out the need for a quantitative expression of Ockham's razor (Jeffreys, 1961; Sivia, 1996; Yuen, 2010). Ockham's razor basically means that one should not complicate explanations when simple ones suffice (Kovac and Weisberg, 2012). When there are two competing hypotheses, the simpler one with fewer assumptions should be preferred. When this principle is applied to model selection, the hypothesis corresponds to the model class and the assumptions correspond to the number of parameters in the model. Box and Jenkins also pointed out the need for the same principle of parsimony in time series forecasting (Box and Jenkins, 1970). In recent years, the Bayesian approach has been widely applied to model selection in different disciplines because it can estimate the evidence of each model, and it has been shown that the model evidence automatically enforces a quantitative expression of the principle of model parsimony (Mackay, 1992; Beck and Yuen, 2004; Yuen, 2010; Yuen and Kuok, 2010; Jenkins and Peacock, 2011; Mok et al., 2017; Gamse et al., 2018; Prakash and Balomenos, 2021). It is therefore not necessary to introduce any ad hoc penalty term for model complexity. In this chapter, the application of Bayesian methods to model selection for linear regression-based air quality models is introduced in Section 5.2.

5.1.2 Necessity of model adaptiveness

Apart from the problem of overfitting, the performance of statistical air quality models can also be degraded by "concept drift" (Gama et al., 2014; Moreno-Torres et al., 2012; Bayram et al., 2022; Lee and Park, 2022). Although the previous chapter offers various linear and nonlinear modeling options, those machine learning models are usually trained in an offline fashion. The models are trained on historical data of the input vector $x$ and the measured output $z$. Based on the training dataset $Z = \{(x_1, z_1), \ldots, (x_N, z_N)\}$, the optimal parameter vector $\hat{\beta}$ is estimated according to the criterion of maximum likelihood. In other words, we obtain a predictive model that can be represented by a conditional probability distribution $p(z \mid x; \hat{\beta})$. After the training is complete, the model is applied to air quality prediction for unseen input data based on the model parameter $\hat{\beta}$.

Suppose first that the conditional predictive model during the actual application is the same as in the training phase. When the model is used to give predictions for input data that lie far away from the training domain, the model performance may still degrade. For example, assume that a low-cost PM2.5 sensor is calibrated against the collocated measurements of a TEOM (with heater) and a relative humidity (RH) sensor, so that $x = \{[\mathrm{PM}_{2.5}]_{\mathrm{LCS}}, \mathrm{RH}\}$ are the input variables of the sensor correction model. Suppose the calibration is made in winter, when the relative humidity ranges between 40% and 60%, and the calibrated sensor correction model is then applied to correct the readings in the following spring and summer. It will be found that the sensor correction model obtained in winter may overestimate the particle mass concentration in spring or summer. When the RH reaches 90% in spring or summer, there is a significant uptake of water by hygroscopic aerosols such as sulfate, nitrate, and sea salt (Hu et al., 2010; Li et al., 2021). This enlarges the diameters of the particles detected by the low-cost sensor, whereas those detected by the TEOM are not affected because its heater removes the moisture in the sampling tube.

Another situation of concept shift is that the conditional predictive model of the application period differs from that of the training period, $p_{\mathrm{App}}(z \mid x; \hat{\beta}) \neq p_{\mathrm{Tra}}(z \mid x; \hat{\beta})$. This may occur when the underlying mechanism of the system has changed. In this case, the model degrades even when the same input data are presented to it ($x' = x$ or $p_{\mathrm{App}}(x) = p_{\mathrm{Tra}}(x)$).
For example, suppose a prediction model of CO concentration was developed with meteorological variables (e.g., wind speed, wind direction, and mixing height) as the model inputs, and that the model was trained on the CO concentration and the meteorological data of the winter of 2018. This model may be invalid for another winter during the COVID-19 lockdown. As the lockdown dramatically reduced human mobility, the restriction measures were followed by a dramatic reduction in the emissions from transportation, industrial, and social activities (Putaud et al., 2021; Liu et al., 2021). The overall relationship between the CO concentration and the meteorological variables is therefore expected to have changed. The conditional predictive models $p_{\mathrm{Lockdown}}(z \mid x; \hat{\beta})$ and $p_{\mathrm{Prelockdown}}(z \mid x; \hat{\beta})$ should not be equal even when the meteorological conditions during both winter periods are identical.

To reduce the problem of concept drift in statistical air quality models, it is necessary either to retrain the model regularly or to update the model continuously through adaptive learning. As judging the appropriate retraining frequency is subjective, adaptive learning is preferable. The coefficients of an adaptive model are time dependent and are updated at every time step by using the real-time measurement, which is either the output response or another proxy measurement. This accommodates a changing system or relationship. In this chapter, the Kalman filter-based adaptive statistical air quality models are introduced in Sections 5.3 and 5.4.
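The second kind of drift, where the relationship between input and output changes while the inputs stay in the same range, can be illustrated with purely synthetic data. In the sketch below, the linear relationship, the slopes, the noise level, and the seed are all invented for illustration and do not correspond to any measured pollutant. A model fitted once by least squares is applied to a period with the original relationship and to a period in which the slope has changed.

```python
import numpy as np

rng = np.random.default_rng(1)   # assumed seed

def simulate(n, slope, x_low=0.0, x_high=1.0):
    """Synthetic data z = 1 + slope * x + noise (illustrative only)."""
    x = rng.uniform(x_low, x_high, n)
    z = 1.0 + slope * x + 0.1 * rng.standard_normal(n)
    return x, z

def rmse(pred, obs):
    return float(np.sqrt(np.mean((pred - obs) ** 2)))

# Offline training: intercept and slope are estimated once and then frozen.
x_tr, z_tr = simulate(500, slope=2.0)
design = np.column_stack([np.ones_like(x_tr), x_tr])
beta_hat, *_ = np.linalg.lstsq(design, z_tr, rcond=None)

def predict(x):
    return beta_hat[0] + beta_hat[1] * x

# Period A: same p(z | x) as in training.
x_a, z_a = simulate(500, slope=2.0)
# Period B: same input range, but p(z | x) has changed (slope halved).
x_b, z_b = simulate(500, slope=1.0)

rmse_same = rmse(predict(x_a), z_a)
rmse_drift = rmse(predict(x_b), z_b)
print(f"RMSE without drift: {rmse_same:.3f}")
print(f"RMSE with drift:    {rmse_drift:.3f}")
```

Because the frozen coefficients no longer describe period B, the error there is dominated by the systematic slope mismatch rather than by the measurement noise, which is the situation that regular retraining or adaptive updating is meant to address.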

5.2 Bayesian model class selection of linear regression model

5.2.1 Overview

As mentioned in the previous section, the linear regression-based air quality model attempts to model an air quality parameter by a linear combination of the input variables. The modeling performance depends on how the input variables or functions are chosen or specified in the model. When a simple functional form with fewer input variables or simpler input functions is chosen, the model may not be able to capture the nonlinear relationship between the input variables and the target variable. When the model is too complex, it tends to overfit the data in the training phase and may then underperform in the prediction phase. To achieve a trade-off between model efficiency and robustness, Bayesian model class selection is introduced in this section to perform variable selection for the linear regression model. First, the formulation of Bayesian model class selection is presented. Then, an application is presented to demonstrate the technique.

5.2.2 Basics of Bayesian model class selection in linear regression model

Consider a linear regression-based air quality model of the following form:

$$ z_k = \beta_0 + \beta_1 x_{1,k} + \cdots + \beta_m x_{m,k} + n_k \tag{5.3} $$

where the symbol $z_k \equiv z(k\Delta t)$ stands for the target variable observed at the $k$th time step and $\Delta t$ is the sampling interval or the averaging interval (month, day, hour, etc.). In Eq. (5.3), the target variable is modeled by a linear combination of the input variables/functions, where $\beta_1, \ldots, \beta_m$ represent the model coefficients and $\beta_0$ is the model intercept. If all input variables are standardized, the model coefficients also indicate the sensitivities of the model output to the input variables. The symbol $n_k$ represents the modeling error, which is normally distributed with zero mean and standard deviation $\sigma$. Given a set of input variables $X_1, \ldots, X_m$, each linear regression-based air quality model candidate $A_j$ is constructed from a specific linear combination of input variables, e.g., $\beta_1 x_1$, $\beta_0 + \beta_2 x_2$, or $\beta_1 x_1 + \beta_3 x_3$. The $m$ input variables together with the model intercept give $2^{m+1}$ combinations in total; the null combination is trivial and is removed, which leaves a maximum of $2^{m+1} - 1$ model candidates. To select the most probable air quality model class $\hat{A}$ from the pool of model candidates $\{A_1, \ldots, A_{N_A}\}$, the aim of Bayesian model class selection is to calculate the probability $P(A_j \mid Z)$ of each model candidate based on the given monitoring data $Z$. The model candidate with the maximum $P(A_j \mid Z)$ is selected as the most probable air quality model among the model candidates:

$$ \hat{A} = \underset{j}{\arg\max}\; P(A_j \mid Z), \quad j = 1, \ldots, N_A \tag{5.4} $$
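The pool of model candidates can be generated mechanically. The following sketch enumerates every nonempty subset of the terms, where index 0 is taken to stand for the intercept and index i for the input variable x_i. The function name and the tuple representation are illustrative choices and are not part of the original formulation.

```python
from itertools import combinations

def enumerate_candidates(m):
    """Return all nonempty subsets of the term indices {0, 1, ..., m}.

    Index 0 denotes the intercept and index i >= 1 denotes input variable x_i.
    """
    terms = range(m + 1)
    return [subset
            for size in range(1, m + 2)
            for subset in combinations(terms, size)]

candidates = enumerate_candidates(3)
print(len(candidates))      # number of candidates for m = 3
print(candidates[:5])       # first few candidates as index tuples
```

With this representation, the candidate containing only x_1 is the tuple (1,), the candidate with the intercept and x_2 is (0, 2), and the candidate with x_1 and x_3 is (1, 3).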

From Bayes' theorem, the conditioning of events can be interchanged, and the probability of the model candidate $P(A_j \mid Z)$ conditional on the monitoring data $Z$ can be written as follows:

$$ P(A_j \mid Z) = \frac{p(Z \mid A_j)\, P(A_j)}{p(Z)} \tag{5.5} $$

where $p(Z \mid A_j)$ represents the model evidence, indicating how well the model candidate $A_j$ captures the monitoring data $Z$. The term $P(A_j)$ represents the prior probability of the model candidate specified by the user. If the user has no preference for any model candidate, a uniform prior can be assumed so that $P(A_j) = 1/N_A$. For a small number of input variables, the computational time is not a concern, so $N_A$ is set equal to the maximum number of combinations $2^{m+1} - 1$ and all model combinations are evaluated. The term $p(Z)$ is the normalizing constant that makes the probabilities $P(A_j \mid Z)$ of all model candidates sum to 1. From the law of total probability, the model evidence can be expressed by the following integral over the parametric space $\Omega$:

$$ p(Z \mid A_j) = \int_{\Omega} p(Z \mid \beta; A_j)\, p(\beta \mid A_j)\, d\beta, \quad \text{where } \beta \in \Omega \subseteq \mathbb{R}^{m+1} \tag{5.6} $$
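Once the evidence of each candidate is available, Eqs. (5.4) and (5.5) reduce to a normalization followed by an argmax. The sketch below works with log-evidences, since the evidence of a long monitoring record is usually far too small to represent directly as a floating-point number. The function name is an illustrative choice and the three log-evidence values are hypothetical numbers, not results from the book.

```python
import numpy as np

def model_probabilities(log_evidence, prior=None):
    """Posterior model probabilities P(A_j | Z) from log p(Z | A_j), Eq. (5.5)."""
    log_evidence = np.asarray(log_evidence, dtype=float)
    if prior is None:
        # Uniform prior P(A_j) = 1 / N_A.
        prior = np.full(log_evidence.size, 1.0 / log_evidence.size)
    log_post = log_evidence + np.log(prior)
    log_post -= log_post.max()       # shift to avoid underflow in exp
    post = np.exp(log_post)
    return post / post.sum()         # division by p(Z)

# Hypothetical log-evidences of three model candidates.
probs = model_probabilities([-1050.2, -1043.7, -1046.1])
best = int(np.argmax(probs))         # index of the most probable candidate, Eq. (5.4)
print(probs, best)
```

Subtracting the largest log-posterior before exponentiating does not change the normalized probabilities, because the common factor cancels in the final division.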

Inside the integral, the first term of the integrand, $p(Z \mid \beta; A_j)$, denotes the likelihood, which represents the goodness of fit for the given parameter vector $\beta$ in the model candidate $A_j$. The parameter vector $\beta = [\beta_0, \ldots, \beta_m]^T$ is an $(m+1) \times 1$ column vector containing the model intercept and the other model coefficients. The second term of the integrand, $p(\beta \mid A_j)$, is the prior probability density function (prior PDF) of the model parameter vector in the model candidate $A_j$, which represents the prior knowledge of the user about the parameters of the model candidate. As it is difficult to obtain an analytical solution for this integral, an alternative is to compute it by numerical integration. Assume that a regular grid with $p$ intervals in each dimension is used. Then there are $(p+1)^{m+1}$ grid points in the parametric space, so the computational effort of the numerical integration is of the order of $\tau (p+1)^{m+1}$, where $\tau$ is the average time required to compute the value of the integrand. Assume that $m = 5$, $p = 10$, and each grid point requires 0.0001 s of computation time. There are then $11^6 \approx 1.77 \times 10^6$ grid points, and it takes about 177 s to compute the model evidence for a single model candidate with 5 input variables and 10 intervals. This cost grows exponentially with the number of inputs and must be paid for every model candidate. Moreover, 10 intervals for each dimension can only provide a very rough estimate of the model evidence. As a linear regression-based air quality model usually consists of many model inputs, the computation of the model evidence by numerical integration requires a lot of computational time and is basically impractical. Therefore an asymptotic expansion is used in this chapter to compute the model evidence.

To derive this asymptotic expansion, we start with the posterior probability density function of the parameter vector, $p(\beta \mid Z; A_j)$, for a model candidate $A_j$. Unlike the prior PDF, the posterior PDF is conditional on the given monitoring dataset $Z$. The given information updates our estimates of the model parameters and the corresponding uncertainties. As the probability of the sample space is always equal to 1, the integral of the posterior PDF $p(\beta \mid Z; A_j)$ over the parametric space is equal to 1:

$$ \int_{\Omega} p(\beta \mid Z; A_j)\, d\beta = 1 \tag{5.7} $$
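The cost of the grid-based integration discussed above follows directly from the number of grid points. The short sketch below evaluates the count (p + 1)^(m+1) and the resulting time in seconds for the quoted figures (m = 5, 0.0001 s per grid point). The additional grid resolutions in the loop are illustrative assumptions, chosen only to show how quickly the cost grows.

```python
def grid_integration_cost(m, p, seconds_per_point):
    """Grid points and total time (s) for a regular grid with p intervals
    in each of the m + 1 dimensions of the parameter space."""
    n_points = (p + 1) ** (m + 1)
    return n_points, n_points * seconds_per_point

for p in (10, 20, 50, 100):
    n_points, seconds = grid_integration_cost(m=5, p=p, seconds_per_point=1e-4)
    print(f"p = {p:3d}: {n_points:.3e} grid points, "
          f"{seconds:.3e} s = {seconds / 3600:.3e} h per model candidate")
```

Doubling the resolution in every dimension multiplies the cost by roughly 2^(m+1), so a finer grid or a larger number of inputs quickly makes the direct evaluation of Eq. (5.6) unattractive.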

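For a linear regression problem, the posterior PDF $p(\beta \mid Z; A_j)$ is Gaussian:

$$ p(\beta \mid Z; A_j) = \frac{1}{(2\pi)^{(m+1)/2}\, |\Sigma|^{1/2}} \exp\left[ -\frac{1}{2} (\beta - \hat{\beta})^T \Sigma^{-1} (\beta - \hat{\beta}) \right] \tag{5.8} $$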

The multivariate Gaussian distribution has the mean parameter vector $\hat{\beta}$ and the covariance matrix $\Sigma$. The mean parameter vector represents the optimal estimates of the parameters based on the monitoring dataset $Z$. Each diagonal element $\Sigma_{i,i}$ of the covariance matrix is a variance, which quantifies the uncertainty of the corresponding parameter estimate, whereas the off-diagonal elements $\Sigma_{i,j}$ represent the covariances between the estimates of the $i$th and $j$th parameters for all $i \neq j$. Eq. (5.7) can be rewritten in the following form:

$$ \int_{\Omega} \exp\left[ \ln p(\beta \mid Z; A_j) \right] d\beta = 1 \tag{5.9} $$

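By replacing $\ln p(\beta \mid Z; A_j)$ with its second-order Taylor series expansion around the mean parameter vector $\hat{\beta}$, Eq. (5.9) becomes:

$$ \int_{\Omega} \exp\left\{ \ln p(\hat{\beta} \mid Z; A_j) + \left[ \left. \frac{\partial \ln p(\beta \mid Z; A_j)}{\partial \beta} \right|_{\beta = \hat{\beta}} \right]^T (\beta - \hat{\beta}) + \frac{1}{2} (\beta - \hat{\beta})^T H (\beta - \hat{\beta}) \right\} d\beta = 1 \tag{5.10} $$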

The gradient of $\ln p(\beta \mid Z; A_j)$ with respect to $\beta$ vanishes at the stationary point $\beta = \hat{\beta}$. The matrix $H$ is called the Hessian matrix, which is defined as the second-order derivative of $\ln p(\beta \mid Z; A_j)$ with respect to the parameter vector $\beta$. Each matrix element $H_{r,s}$ is equal to the corresponding second-order derivative of $\ln p(\beta \mid Z; A_j)$ with respect to $\beta_r$ and $\beta_s$ evaluated at their optimal estimates:

$$ H_{r,s} = \left. \frac{\partial^2 \ln p(\beta \mid Z; A_j)}{\partial \beta_r\, \partial \beta_s} \right|_{\beta_r = \hat{\beta}_r,\ \beta_s = \hat{\beta}_s} \tag{5.11} $$

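When $\beta \mid Z; A_j$ follows a multivariate Gaussian distribution, the natural logarithm of $p(\beta \mid Z; A_j)$ is equal to:

$$ \ln p(\beta \mid Z; A_j) = -\frac{m+1}{2} \ln(2\pi) - \frac{1}{2} \ln(|\Sigma|) - \frac{1}{2} (\beta - \hat{\beta})^T \Sigma^{-1} (\beta - \hat{\beta}) \tag{5.12} $$

Eq. (5.12) is elaborated further by expanding the product of $(\beta^T - \hat{\beta}^T)$ and $(\Sigma^{-1}\beta - \Sigma^{-1}\hat{\beta})$:

$$ \begin{aligned} \ln p(\beta \mid Z; A_j) &= -\frac{m+1}{2} \ln(2\pi) - \frac{1}{2} \ln(|\Sigma|) - \frac{1}{2} \left( \beta^T - \hat{\beta}^T \right) \left( \Sigma^{-1}\beta - \Sigma^{-1}\hat{\beta} \right) \\ &= -\frac{m+1}{2} \ln(2\pi) - \frac{1}{2} \ln(|\Sigma|) - \frac{1}{2} \left( \beta^T \Sigma^{-1} \beta - 2\hat{\beta}^T \Sigma^{-1} \beta + \hat{\beta}^T \Sigma^{-1} \hat{\beta} \right) \end{aligned} \tag{5.13} $$

By differentiating Eq. (5.13) twice with respect to $\beta$, it is found that the Hessian matrix is equal to the negative of the inverse of the covariance matrix:

$$ H = -\Sigma^{-1} \tag{5.14} $$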
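As a short check of Eq. (5.14), the two differentiation steps can be written out explicitly. The sketch assumes only that $\Sigma$ is symmetric, as any covariance matrix is:

$$ \frac{\partial \ln p(\beta \mid Z; A_j)}{\partial \beta} = -\frac{1}{2} \left( 2\Sigma^{-1}\beta - 2\Sigma^{-1}\hat{\beta} \right) = -\Sigma^{-1} (\beta - \hat{\beta}), \qquad \frac{\partial^2 \ln p(\beta \mid Z; A_j)}{\partial \beta\, \partial \beta^T} = -\Sigma^{-1} $$

The first derivative is zero at $\beta = \hat{\beta}$, which is why the linear term of the Taylor expansion in Eq. (5.10) drops out.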

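Substituting Eq. (5.14) into Eq. (5.10), we have:

$$ \begin{aligned} \int_{\Omega} \exp\left[ \ln p(\hat{\beta} \mid Z; A_j) - \frac{1}{2} (\beta - \hat{\beta})^T \Sigma^{-1} (\beta - \hat{\beta}) \right] d\beta &= 1 \\ p(\hat{\beta} \mid Z; A_j) \int_{\Omega} \exp\left[ -\frac{1}{2} (\beta - \hat{\beta})^T \Sigma^{-1} (\beta - \hat{\beta}) \right] d\beta &= 1 \\ p(\hat{\beta} \mid Z; A_j)\, (2\pi)^{(m+1)/2}\, |\Sigma|^{1/2} &= 1 \end{aligned} \tag{5.15} $$

From Bayes' theorem, the posterior probability density $p(\hat{\beta} \mid Z; A_j)$ at $\beta = \hat{\beta}$ can be evaluated by:

$$ p(\hat{\beta} \mid Z; A_j) = \frac{p(Z \mid \hat{\beta}; A_j)\, p(\hat{\beta} \mid A_j)}{p(Z \mid A_j)} \tag{5.16} $$

By substituting Eq. (5.16) into Eq. (5.15), the asymptotic expression for computing the evidence of each model candidate is obtained (Beck and Yuen, 2004; Yuen, 2010):

$$ p(Z \mid A_j) = p(Z \mid \hat{\beta}; A_j)\, p(\hat{\beta} \mid A_j)\, (2\pi)^{(m+1)/2}\, |\Sigma|^{1/2} \tag{5.17} $$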
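Eq. (5.17) can be checked numerically in a setting where the evidence is also available in closed form. The sketch below is not the book's implementation. It assumes a known noise standard deviation, a zero-mean Gaussian prior with a common standard deviation for all parameters (so that the most probable parameter vector is the posterior mean), and synthetic data with an arbitrary seed. Under these assumptions the posterior is exactly Gaussian, so the logarithm of Eq. (5.17) should agree with the exact log-evidence.

```python
import numpy as np

def log_evidence_laplace(X, z, sigma, prior_std):
    """Log of Eq. (5.17) for z = X beta + n with n ~ N(0, sigma^2 I) and
    an assumed prior beta ~ N(0, prior_std^2 I)."""
    n_obs, dim = X.shape                       # dim plays the role of m + 1
    precision = X.T @ X / sigma**2 + np.eye(dim) / prior_std**2
    cov = np.linalg.inv(precision)             # posterior covariance matrix
    beta_hat = cov @ X.T @ z / sigma**2        # most probable parameter vector
    resid = z - X @ beta_hat
    log_like = -0.5 * n_obs * np.log(2 * np.pi * sigma**2) - 0.5 * resid @ resid / sigma**2
    log_prior = (-0.5 * dim * np.log(2 * np.pi * prior_std**2)
                 - 0.5 * beta_hat @ beta_hat / prior_std**2)
    _, logdet_cov = np.linalg.slogdet(cov)
    return log_like + log_prior + 0.5 * dim * np.log(2 * np.pi) + 0.5 * logdet_cov

def log_evidence_exact(X, z, sigma, prior_std):
    """Exact log-evidence: marginally, Z ~ N(0, sigma^2 I + prior_std^2 X X^T)."""
    n_obs = X.shape[0]
    C = sigma**2 * np.eye(n_obs) + prior_std**2 * X @ X.T
    _, logdet = np.linalg.slogdet(C)
    return -0.5 * (n_obs * np.log(2 * np.pi) + logdet + z @ np.linalg.solve(C, z))

rng = np.random.default_rng(0)                 # assumed seed
x1 = rng.standard_normal(50)
x2 = rng.standard_normal(50)                   # irrelevant input by construction
z = 1.0 + 2.0 * x1 + 0.3 * rng.standard_normal(50)

ones = np.ones(50)
candidates = {
    "b0": np.column_stack([ones]),
    "b0 + b1*x1": np.column_stack([ones, x1]),
    "b0 + b1*x1 + b2*x2": np.column_stack([ones, x1, x2]),
}
log_ev = {name: log_evidence_laplace(X, z, sigma=0.3, prior_std=10.0)
          for name, X in candidates.items()}
for name, value in log_ev.items():
    print(f"{name:20s} log-evidence = {value:.2f}")
```

The factor $(2\pi)^{(m+1)/2} |\Sigma|^{1/2}$ shrinks as more parameters are added and as each of them becomes well determined relative to its prior, which is how the evidence penalizes extra terms without an explicit penalty.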


The first term on the right-hand side of Eq. (5.17) is the likelihood factor, which represents the goodness of fit to the training data $Z$ for the given model candidate $A_j$ and the optimal parameter vector $\hat{\beta}$ learned in the training phase through Eq. (4.16). The likelihood factor $p(Z \mid \hat{\beta}; A_j)$ is computed by:

$$ p(Z \mid \hat{\beta}; A_j) = \frac{1}{(2\pi)^{N/2} \sigma^N} \exp\left[ -\frac{1}{2\sigma^2} \sum_{k=1}^{N} (z_k - \hat{z}_k)^2 \right] \tag{5.18} $$

where zbk denotes the model prediction at the kth time step and zk zbk is the modeling error nk. In Eq. (5.18), the only unknown is the variance σ 2. Therefore we can differentiate Eq. (5.18) with respect to σ 2 and set it to zero in order to calculate b σ 2:

$$
-\frac{N}{2}\left(\hat{\sigma}^2\right)^{-\frac{N}{2}-1}\exp\left[-\frac{\sum_{k=1}^{N}(n_k)^2}{2\hat{\sigma}^2}\right] + \left(\hat{\sigma}^2\right)^{-\frac{N}{2}}\exp\left[-\frac{\sum_{k=1}^{N}(n_k)^2}{2\hat{\sigma}^2}\right]\frac{\sum_{k=1}^{N}(n_k)^2}{2\left(\hat{\sigma}^2\right)^2} = 0
$$

[...] 100,000 MOP). Hence, the low-cost sensor network can be deployed in the city at a much finer spatial resolution compared to the conventional network. Other advantages over the conventional monitors include compact size, easy installation, low power consumption, cloud storage, and remote data access and calibration. Despite these advantages, previous studies showed that their accuracy is relatively low (Borrego et al., 2016; Feenstra et al., 2019). The sensor performance is sensitive to the ambient pollution level and other environmental variables such as temperature and relative humidity (Crilley et al., 2018; Liang, 2021). If the environmental conditions where the sensor is deployed (e.g., aerosol chemical composition, particle shape and refractive index, and pollution level) are markedly different from those of the factory calibration, the sensors may yield unsatisfactory performance. To correct for these interferences, the low-cost sensors need field calibration against the reference monitors. Sensor correction models are trained with the gathered data and the calibrated models are then applied to correct the sensor readings.

In this example, we will attempt to develop a model relating the PM2.5 concentration of the low-cost sensor (LCS) manufactured by Purple Air with the PM2.5 concentration of the Tapered Element Oscillating Microbalance (TEOM). Table 5.5 presents the specification of the Purple Air LCS. The low-cost sensor detects the PM2.5 concentration based on the internal laser particle counter. The data are open to the public and can be directly downloaded from the company website (https://map.purpleair.com/).

Table 5.5 Specification of Purple Air PM2.5 low-cost sensor.

Item                        Specification
Dimension                   3.5 in × 3.5 in × 5 in (85 mm × 85 mm × 125 mm)
Weight                      12.6 oz. (357 g)
Particulate sensor          Plantower PMS 5003
Range of measurement        0.3, 0.5, 1.0, 2.5, 5.0, and 10 μm
Counting efficiency         50% at 0.3 μm and 98% at 0.5 μm
Maximum consistency error   ±10 μg/m3 at 0–100 μg/m3

In this table, the counting efficiency of 50% corresponds to the particle size of 0.3 μm. This particle diameter usually refers to the minimum detectable diameter of the optical particle counter. As the sensor relies on laser light scattering to determine the amount of particles in the sample volume of air drawn into the sensor, particles which are smaller than the Mie scattering regime may not produce enough light scattering for particle counting. This is another factor which causes the LCS reading to differ from that of a reference monitor of particulate mass concentrations (e.g., TEOM), besides the influencing factors mentioned before.

The Purple Air sensor chosen in this example is the one located at North Point on Hong Kong Island, shown in Fig. 5.12. The closest reference monitor selected to provide the model input is the TEOM PM2.5 monitor located at the roadside air quality station of Causeway Bay. The shortest distance between the TEOM monitor and the Purple Air sensor is about 2 km. Fig. 5.13 shows the measured hourly averaged PM2.5 concentrations by Purple Air (top) and TEOM (bottom) between April 11, 2021 (00:00 local time) and May 1, 2021 (00:00 local time). As about 3.96% of the TEOM data are missing, the cubic

spline is used to fill the gap. The basic idea of the cubic spline interpolation is to approximate the missing data by fitting a set of piecewise cubic functions passing through the available data points (Kong et al., 2020). In this method of interpolation, the second-order derivatives at the first and last data points are assumed to be zero. Besides, the interpolant and its first- and second-order derivatives are assumed to be continuous at each intermediate data point. The red line in the bottom panel of Fig. 5.13 represents the raw hourly averaged concentration measured by TEOM, while the blue dotted line represents the hourly averaged concentration estimated by cubic spline interpolation.

Fig. 5.12 Map of Hong Kong Island and locations of the reference monitor (TEOM) and the low-cost sensor (Purple Air).

Fig. 5.13 Measured PM2.5 concentrations by the Purple Air low-cost sensor at North Point (top, blue line) and the TEOM monitor at Causeway Bay (bottom, red line), and estimated PM2.5 concentrations by the cubic spline at Causeway Bay (bottom, blue dotted line).

Following the procedures of application presented in Table 5.4, the first step is to propose a set of Kalman filter-based air quality model candidates. In this example, five model candidates were used and Table 5.6 presents the variables used in each model candidate. The output variable is the hourly averaged PM2.5 concentration of the Purple Air at the kth hour, while the input variables include the hourly averaged PM2.5 concentration of the TEOM from the kth hour to the (k−5)th hour. Then, all five adaptive model candidates are run continuously so that the coefficients of each model candidate are updated at each hour based on the innovation gained from the measured concentration by Purple Air. The hourly probabilities of the five model candidates are also calculated by using this dataset and are shown in Fig. 5.14. At the beginning, all model candidates have the same prior probability of 1/5. After a few hours of updating, the probability of the model candidate A5 given the measurements Zk becomes the highest and remains so for the rest of the period. By using the estimated hourly probabilities, the weighted average of the estimated PM2.5 concentrations is obtained from the outputs of the different model candidates based on Eq. (5.128).
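The gap filling by cubic spline described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the procedure actually used for Fig. 5.13: the function name and the short hourly series are hypothetical, and SciPy's `CubicSpline` with `bc_type="natural"` is used because it imposes zero second-order derivatives at the first and last data points, which matches the boundary condition stated in the text.

```python
import numpy as np
from scipy.interpolate import CubicSpline


def fill_gaps_natural_spline(hours, values):
    """Fill NaN entries of an hourly series with a natural cubic spline."""
    hours = np.asarray(hours, dtype=float)
    values = np.asarray(values, dtype=float)
    known = ~np.isnan(values)
    # bc_type="natural" sets the second derivative to zero at both end points.
    spline = CubicSpline(hours[known], values[known], bc_type="natural")
    filled = values.copy()
    filled[~known] = spline(hours[~known])
    return filled, spline


# Hypothetical hourly PM2.5 readings (ug/m3) with two missing hours.
hours = np.arange(8)
pm25 = np.array([12.0, 14.0, np.nan, 19.0, 18.0, np.nan, 13.0, 11.0])
filled, spline = fill_gaps_natural_spline(hours, pm25)
```

The sketch only interpolates interior gaps. Missing values before the first or after the last valid reading would be extrapolated by the spline and should be treated with caution.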

Table 5.6 Variables used in the Kalman filter-based air quality model candidates.

Model   Dependent variable      Independent variable(s)
A1      [PM2.5]Purple Air,k     [PM2.5]TEOM,k, [PM2.5]TEOM,k−1
A2      [PM2.5]Purple Air,k     [PM2.5]TEOM,k, [PM2.5]TEOM,k−1, [PM2.5]TEOM,k−2
A3      [PM2.5]Purple Air,k     [PM2.5]TEOM,k, [PM2.5]TEOM,k−1, [PM2.5]TEOM,k−2, [PM2.5]TEOM,k−3
A4      [PM2.5]Purple Air,k     [PM2.5]TEOM,k, [PM2.5]TEOM,k−1, [PM2.5]TEOM,k−2, [PM2.5]TEOM,k−3, [PM2.5]TEOM,k−4
A5      [PM2.5]Purple Air,k     [PM2.5]TEOM,k, [PM2.5]TEOM,k−1, [PM2.5]TEOM,k−2, [PM2.5]TEOM,k−3, [PM2.5]TEOM,k−4, [PM2.5]TEOM,k−5

Fig. 5.14 Probabilities of model candidates updated at each hour based on the measured concentration of Purple Air.
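The hourly updating of the candidate probabilities and the weighted averaging of the candidate outputs can be illustrated with a short Python sketch. It assumes a common recursive form of Bayes' rule in which each candidate's one-step-ahead predictive density is Gaussian, with the mean and innovation variance supplied by its Kalman filter. The chapter's own update equations are not reproduced in this passage, so the function names, the numerical values, and this particular form are assumptions.

```python
import numpy as np


def update_model_probabilities(prior_probs, z_obs, z_pred, innov_var):
    """One Bayes update of the candidate probabilities from a new measurement."""
    prior_probs = np.asarray(prior_probs, dtype=float)
    z_pred = np.asarray(z_pred, dtype=float)
    innov_var = np.asarray(innov_var, dtype=float)
    # Gaussian one-step-ahead predictive log-density of each candidate.
    log_like = -0.5 * (np.log(2 * np.pi * innov_var)
                       + (z_obs - z_pred) ** 2 / innov_var)
    log_post = np.log(prior_probs) + log_like
    log_post -= log_post.max()            # guard against numerical underflow
    post = np.exp(log_post)
    return post / post.sum()


def bma_estimate(probs, z_est):
    """Probability-weighted average of the candidates' estimates."""
    return float(np.dot(probs, z_est))


# Five candidates start from the uniform prior of 1/5, as in the example.
probs = np.full(5, 1 / 5)
z_pred = np.array([20.0, 18.0, 17.0, 16.0, 15.2])   # hypothetical predictions
innov_var = np.full(5, 4.0)                          # hypothetical variances
probs = update_model_probabilities(probs, z_obs=15.0, z_pred=z_pred,
                                   innov_var=innov_var)
combined = bma_estimate(probs, z_pred)
```

Calling `update_model_probabilities` once per hour, with the previous output as the new prior, gives probability traces of the kind plotted in Fig. 5.14.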

Fig. 5.15 shows the scatterplot of the measured PM2.5 concentration by the Purple Air low-cost sensor versus the estimated PM2.5 concentration from the weighted average of five model outputs. It is noted that the blue dots (R2 = 0.95) stay very close to the line of equality (red line), meaning that the performance of the combined model prediction is satisfactory.

Fig. 5.15 Scatterplot of measured PM2.5 concentration by Purple Air versus the weighted average of PM2.5 concentration by model candidates A1 to A5.

Because of the limited availability of public data, the adaptive BMA scheme is only applied to a one-to-one correction model in this example. In a real scenario, there will be a sparse network of m TEOM monitors and a dense network of p LCS sensors. Under this circumstance, it is suggested to simplify the analysis by reducing the dimension of the correlated air quality dataset with the empirical orthogonal function (EOF) analysis (Lorenz, 1956; Hannachi et al., 2010). Assume that we have the concentration field measured by the TEOM monitoring network and it is rearranged in the following format, which is an N × m data matrix $C_T$:

$$
C_T = \begin{bmatrix} C_{11} & \cdots & C_{1m} \\ \vdots & \ddots & \vdots \\ C_{N1} & \cdots & C_{Nm} \end{bmatrix} \tag{5.147}
$$

Each column represents the measured PM2.5 concentrations by the ith TEOM monitor, where i = 1, …, m. Each row represents the spatial map of measured PM2.5 concentrations at the kth time step, where k = 1, …, N. We also assume we have the concentration field measured by the low-cost sensor network and it is rearranged as the following N × p data matrix $C_L$:

$$
C_L = \begin{bmatrix} C_{11} & \cdots & C_{1p} \\ \vdots & \ddots & \vdots \\ C_{N1} & \cdots & C_{Np} \end{bmatrix} \tag{5.148}
$$

After removing the mean of each column from the matrix $C_T$, each column in the new data matrix $C_T'$ represents the concentration anomalies from the average concentration over the entire period $N\Delta t$, where $\Delta t$ is the sampling time interval. Then, the sample covariance matrix $R = (C_T')^{T} C_T' / (N - 1)$ is calculated, followed by solving the eigenvalue problem:

$$
RS = S\Lambda \tag{5.149}
$$

where $S = [s_1 \mid \dots \mid s_m]$ is the eigenvector matrix of $R$ and each column of $S$ corresponds to an empirical orthogonal mode of $C_T'$. The EOF modes define the new coordinate system to view the concentration anomalies. Each axis of the new coordinate system points in the direction of maximum joint variability of the concentration anomalies. The matrix $\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_m)$ is the diagonal matrix with the eigenvalues arranged in descending order. Each eigenvalue $\lambda_i$ corresponds to the principal variance of the data explained by the EOF mode $s_i$. Therefore the percentage of total variance of $C_T'$ explained by the eigenvectors $s_1, \dots, s_j$ is given by:

$$
\text{Accumulated percentage explained} = \frac{\sum_{i=1}^{j}\lambda_i}{\sum_{k=1}^{m}\lambda_k} \times 100\% \tag{5.150}
$$

The evolution of each EOF in time is found by the following projection:

$$
\varphi_i = C_T'\, s_i, \quad i = 1, \dots, m \tag{5.151}
$$

The projected vector $\varphi_i$ is an N × 1 column vector which contains the time series of the expansion coefficient, or the score, of the ith EOF mode at different time steps. Each expansion coefficient $\varphi_{ik}$ represents the projection of the spatial map of the concentration anomaly at the kth time step onto the ith EOF mode. The original dataset $C_T'$ can also be reconstructed by summing the products $\varphi_i s_i^T$ over all m EOF modes:

$$
C_T' = \sum_{i=1}^{m} \varphi_i s_i^T \tag{5.152}
$$

The dimension reduction mentioned at the beginning of the EOF analysis is now achieved by truncating unnecessary EOF modes. $C_T'$ is approximated by the first few retained modes (j ≪ m) that explain the majority of the total variance (e.g., an accumulated percentage of 80% calculated by Eq. (5.150)):

$$
C_T' \approx \sum_{i=1}^{j} \varphi_i s_i^T \tag{5.153}
$$
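The EOF steps from the covariance matrix to the truncated reconstruction can be sketched with NumPy as follows. This is a minimal illustration on synthetic data and not the book's code: the function names, the synthetic concentration field, and the use of `numpy.linalg.eigh` are choices made here. The 80% threshold is the example value quoted in the text.

```python
import numpy as np


def eof_analysis(C):
    """EOF analysis of an N x m data matrix (rows: time steps, columns: monitors)."""
    mean = C.mean(axis=0)
    anomalies = C - mean                                   # anomaly matrix C_T'
    R = anomalies.T @ anomalies / (C.shape[0] - 1)         # sample covariance
    eigvals, S = np.linalg.eigh(R)                         # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]                      # reorder to descending
    eigvals, S = eigvals[order], S[:, order]               # solves Eq. (5.149)
    scores = anomalies @ S                                 # columns phi_i, Eq. (5.151)
    cum_pct = 100.0 * np.cumsum(eigvals) / eigvals.sum()   # Eq. (5.150)
    return mean, anomalies, eigvals, S, scores, cum_pct


def truncate(scores, S, j):
    """Approximate the anomalies with the first j EOF modes, Eq. (5.153)."""
    return scores[:, :j] @ S[:, :j].T


# Synthetic field: 200 time steps, 6 monitors driven by 2 common signals.
rng = np.random.default_rng(0)
signals = rng.normal(size=(200, 2))
loadings = rng.normal(size=(2, 6))
C_T = 30.0 + signals @ loadings + 0.05 * rng.normal(size=(200, 6))

mean, anomalies, eigvals, S, scores, cum_pct = eof_analysis(C_T)
j = int(np.searchsorted(cum_pct, 80.0) + 1)   # smallest j reaching 80%
approx = truncate(scores, S, j)
```

Keeping all m modes in `truncate` returns the anomaly matrix exactly, which corresponds to Eq. (5.152).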


The number of retained EOF modes can also be decided by using the scree plot or North's rule of thumb (North et al., 1982; Cattell, 2010). Similarly, the retained EOF modes $t_1, \dots, t_r$ can be extracted from the concentration field of the low-cost sensors by performing EOF analysis of $C_L'$. Then, Kalman filter-based model candidates can be proposed to model the expansion coefficients of the LCS based on different combinations of the expansion coefficients of the TEOM monitors. The dynamic Bayesian model averaging is then performed to obtain the weighted averages of the estimated expansion coefficients of the LCS. Finally, the predicted concentrations of the LCS can be obtained by:

$$
\hat{C}_L = \sum_{i=1}^{r} \hat{\psi}_i t_i^T + M \tag{5.154}
$$

where $\hat{\psi}_1, \dots, \hat{\psi}_r$ are the expansion coefficients of the retained EOF modes obtained in dynamic Bayesian model averaging and $M$ is the mean matrix. Each column in $M$ contains the average concentration that was subtracted from the concentrations of the corresponding LCS in $C_L$.

5.6 Case study

5.6.1 Overview

In the previous chapter, different statistical forecasting models of PM2.5 and O3 in Macau were compared based on the same set of input variables. At that time the entire set of input variables was used and some of them may not be necessary. In addition, the model coefficients in the validation phase were assumed to be equal to the estimated coefficients in the training phase. However, this assumption may not hold due to the problem of concept drift. In this case study, the model class selection is revisited by pruning unnecessary input variables based on the Bayesian approach. Apart from that, we will compare the performance of nonadaptive models and the Kalman filter-based adaptive models in order to examine the importance of model adaptiveness.

5.6.2 Air quality forecasting in Macau with the adaptive linear models

According to Table 4.10, there are 17 predictor variables for PM2.5. These variables constitute the building blocks of the model candidates for the MLR-based PM2.5 forecasting model and their different combinations form 2^17 − 1 = 131,071 model candidates in total. As for the tropospheric O3, there are 18 predictor variables in Table 4.10 and there are 2^18 − 1 = 262,143 combinations of MLR model candidates. The Bayesian approach is applied to select the most probable model class of PM2.5 or O3 from the set of model candidates given the training data from 2016 to 2018. The selected model classes will be validated with the dataset from 2019 to 2020.
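The candidate counts quoted above follow from counting every non-empty subset of the predictor variables. A small Python sketch makes this explicit. The helper names and the three predictor labels in the illustration are hypothetical, and in practice each subset would be fitted and scored by its model evidence.

```python
from itertools import combinations


def count_candidates(n_predictors):
    """Number of non-empty predictor subsets: 2^n - 1."""
    return 2 ** n_predictors - 1


def iter_candidates(predictors):
    """Yield every non-empty subset of the predictor names."""
    for size in range(1, len(predictors) + 1):
        yield from combinations(predictors, size)


n_pm25 = count_candidates(17)   # candidates for the PM2.5 model
n_o3 = count_candidates(18)     # candidates for the O3 model

# Small illustration with three hypothetical predictor labels (7 subsets).
small = list(iter_candidates(["Tmax", "RH", "H500"]))
```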


Table 5.7 Variables adopted by the most probable MLR-based forecasting model classes of PM2.5 and the corresponding coefficients.

In this case study, there is no initial preference for any model candidate. A uniform prior probability is assumed so that P(Aj) = 1/131,071 for the PM2.5 model candidates A1, …, A131071. Therefore the most probable model class is the candidate with the highest model evidence among the pool of candidates. Table 5.7 presents the selected predictor variables of the most probable PM2.5 forecasting model class at each station. The positive model coefficients are highlighted in yellow, whereas the negative coefficients are highlighted in blue. Approximately 7–8 predictor variables are adopted at each station.

The daily maximum of hourly surface temperature Tmax is adopted at the PO and TC stations and the coefficients are negative. The aloft temperature T850 at 850 mb is adopted at the remaining stations (TG, EN, CD, and KH) and the coefficients are also negative. During the day, a higher maximum surface temperature implies that the air is more buoyant, which favors the vertical mixing of the convective boundary layer. Therefore a higher maximum surface temperature has a negative effect on PM2.5. For the aloft temperature, its effect on PM2.5 can be more easily interpreted when the temperature difference T850 − Tref is available. This reference temperature can be the minimum surface temperature. As there is only one temperature variable in the model, we do not interpret the influence of T850 on PM2.5 here.

The relative humidity RH is adopted by the models at the TG, PO, and TC stations and it has a negative impact on PM2.5. The reason can be attributed to less secondary organic aerosol (SOA) being formed by photochemical reactions when the relative humidity is high. The geopotential height at 500 mb, H500, is adopted at all stations and all coefficients are positive. A large H500 is usually associated with an upper-level ridge and a surface high-pressure system. The sinking air suppresses the vertical mixing and enhances the subsidence inversion aloft.

The north–south component V of the surface wind direction is adopted at all monitoring stations except the PO station, but the model of the PO station adopts the north–south component V850 of the aloft wind direction at 850 mb. Both coefficients are positive. This implies a positive impact on PM2.5 when the northerly wind brings in dry and continental air masses from the upwind cities of the Greater Bay Area. On the contrary, the southerly wind brings in humid and relatively clean sea breezes from the South China Sea, which is beneficial to the local dilution and dispersion. The last few selected predictor variables are the past histories of PM2.5 and all their coefficients are positive.

As for the ozone prediction model, a uniform prior probability distribution is also assumed so that P(Aj) = 1/262,143 for the O3 model candidates A1, …, A262143. Table 5.8 presents the selected predictor variables of the most probable O3 forecasting model class at each station. In general, 6 to 8 predictor variables are adopted at each station and all model coefficients are positive. The daily maximum of hourly surface temperature Tmax is adopted at all monitoring stations. The coefficient is positive and this variable can be treated as a surrogate of the photochemistry. The north–south component V850 of the aloft wind direction is adopted at all monitoring stations and its role in the model can reflect the transport of ozone and precursors from the upwind cities to the study area. At the EN station, the north–south component V of the surface wind direction is also chosen. The role of this variable in the model should be similar to that of V850. As for the east–west component U of the surface wind direction adopted at the TC, CD, and KH stations, it may be used to reflect the emissions and transport of ozone precursors from the nearby power plant on Coloane Island and the incinerator on Taipa Island. Finally, the past histories of O3 and NO2 concentration are adopted at almost all monitoring stations to reflect the initial conditions of ozone and precursors.

Table 5.8 Variables adopted by the most probable MLR-based forecasting model classes of O3 and the estimated coefficients.

Next, the nonadaptive model at each station is turned into an adaptive forecasting model with time-varying model coefficients by using the Kalman filter. Based on the procedures of Fig. 5.5, the initial parameter vector $\beta_{0,0}$ and the initial covariance matrix $P_{0,0}$ need to be specified at the beginning. The initial parameter vector at each station is assumed to be equal to the estimated coefficients at the corresponding station in Table 5.7 or 5.8. The initial covariance matrix is assumed to be a diagonal matrix whose diagonal elements are proportional to the initial parameter vector. In this case study, a proportionality constant of 100 is used. The initial perturbation matrix is also a diagonal matrix with the diagonal elements proportional to the initial parameter vector, and this proportionality constant is denoted by $p_0$:

$$
Q_0 = \begin{bmatrix}
p_0\beta_1 & 0 & 0 & \dots & 0 \\
0 & p_0\beta_2 & 0 & \dots & 0 \\
0 & 0 & p_0\beta_3 & \dots & \vdots \\
\vdots & \vdots & \vdots & \ddots & 0 \\
0 & 0 & 0 & \dots & p_0\beta_m
\end{bmatrix} \tag{5.155}
$$

where m ranges from 6 to 8 for different stations.
The initial measurement noise variance is assumed to be a multiple of the variance of the measured pollutant concentration (PM2.5 or O3) between 2016 and 2018:

$$
\sigma_0^2 = p_1\, \sigma^2_{\text{time series},\ 2016\text{–}2018} \tag{5.156}
$$
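The initialization in Eqs. (5.155) and (5.156) can be placed in context with a minimal Python sketch of a Kalman filter that tracks regression coefficients modeled as a random walk. This is an illustration under assumptions and not the procedure of Fig. 5.5, which is not reproduced in this passage. The function name, the synthetic data, and the values of p0 and p1 are hypothetical. Absolute values of the coefficients are used on the diagonals so that the covariance matrices stay non-negative when a coefficient is negative, which is an assumption made here and is not stated in the text.

```python
import numpy as np


def kalman_regression(X, z, beta0, P0, Q, sigma2):
    """Track time-varying regression coefficients with a random-walk Kalman filter.

    X: (N, m) inputs, z: (N,) measurements. Returns the filtered coefficients
    and the one-step-ahead forecasts made before each measurement is used.
    """
    beta, P = beta0.astype(float).copy(), P0.astype(float).copy()
    betas = np.empty((len(z), beta.size))
    forecasts = np.empty(len(z))
    for k, (x, zk) in enumerate(zip(X, z)):
        P = P + Q                          # predict: random-walk coefficients
        forecasts[k] = x @ beta            # forecast before seeing z_k
        s = x @ P @ x + sigma2             # innovation variance
        gain = P @ x / s                   # Kalman gain
        beta = beta + gain * (zk - forecasts[k])
        P = P - np.outer(gain, x) @ P
        betas[k] = beta
    return betas, forecasts


# Hypothetical initial coefficients standing in for a column of Table 5.7 or 5.8.
beta_init = np.array([0.8, 0.5, 0.3])
p0, p1 = 1e-4, 0.1                         # hypothetical hyperparameter values
P0 = 100.0 * np.diag(np.abs(beta_init))    # abs() is an assumption, see lead-in
Q0 = p0 * np.diag(np.abs(beta_init))       # Eq. (5.155) with the same assumption

# Synthetic data generated by slowly drifting coefficients.
rng = np.random.default_rng(1)
N = 400
X = rng.normal(size=(N, 3))
drift = np.linspace(0.0, 0.5, N)[:, None] * np.array([1.0, 0.0, -1.0])
true_beta = beta_init + drift
z = np.sum(X * true_beta, axis=1) + 0.1 * rng.normal(size=N)
sigma2_0 = p1 * np.var(z)                  # Eq. (5.156)

betas, forecasts = kalman_regression(X, z, beta_init, P0, Q0, sigma2_0)
```

With drifting coefficients, the filtered estimates in `betas` follow the drift, whereas a fixed-coefficient model keeps using `beta_init` throughout.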

The hyperparameters p0 and p1 are estimated by minimizing the objective function of Eq. (5.68), subject to the constraints that both p0 and p1 are larger than zero. The minimization is performed by using the “fminsearch” or the “fmincon” function in Matlab. The optimal estimates of p0 and p1 at different stations are presented in Table 5.9.

Table 5.9 Optimal estimates of noise parameters in adaptive forecasting models based on the data of years 2016–2018.

Given the optimal estimates in Table 5.9, the Kalman filter is run again for each station. Figs. 5.16 and 5.17 show the estimated coefficients of the PM2.5 and O3 forecasting models at the TG station during 2019 and 2020. In both figures, one time-varying coefficient shows an obvious rising trend. In Fig. 5.16, this is the coefficient β2,k, which corresponds to the aloft temperature T850 in the PM2.5 forecasting model. In Fig. 5.17, it is the coefficient β1,k, which corresponds to the daily maximum of the hourly surface temperature Tmax in the O3 forecasting model.

Fig. 5.16 Time-varying coefficients of the adaptive model for PM2.5 at the TG station.

Fig. 5.17 Time-varying coefficients of the adaptive model for O3 at the TG station.

Fig. 5.18 shows the plots of measured PM2.5 concentrations versus forecasts by the adaptive PM2.5 model at the TG station during 2019 (top) and 2020 (bottom). The magenta solid line represents the prediction and the gray solid line represents the measurement. The predicted PM2.5 concentration by the adaptive model captures most of the variation in the measured PM2.5 concentration at the TG station (R2 = 0.74). The performance at the other monitoring stations is shown in the scatterplots of Fig. 5.19. The majority of the points lie near the line of equality.

Fig. 5.18 Plots of measured PM2.5 concentrations versus forecasted PM2.5 concentrations by the adaptive model at the TG station.

Fig. 5.19 Scatterplots of measured PM2.5 concentrations versus forecasted PM2.5 concentrations by the adaptive models.

As for the tropospheric O3, the plots of measured O3 concentrations versus forecasts by the adaptive O3 model at the TG station during 2019 (top) and 2020 (bottom) are shown in Fig. 5.20. The performance at the other stations is shown in the scatterplots of Fig. 5.21. It is noted that the O3 forecasting model also captures the majority of the variation in the measured O3 concentrations at each station (R2 = 0.83 at the TG station).

The performance of the nonadaptive and the adaptive models for PM2.5 and O3 is compared by using the same set of input variables selected by the Bayesian approach. The same performance measures (R2, RMSE, IA, and MFB) used in the previous chapter are adopted for the comparison presented in Tables 5.10 and 5.11. The adaptive models match or improve on the nonadaptive models for nearly all measures and stations, most notably for the RMSE and the MFB. Therefore we conclude from this case study that adaptiveness helps to mitigate the problem of concept drift in statistical air quality models.
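The four performance measures named above can be computed as in the following Python sketch. Their exact definitions are given in the previous chapter and are not reproduced here, so the forms below are assumptions: R2 is taken as the squared Pearson correlation coefficient, IA as Willmott's index of agreement, and MFB as the mean fractional bias. The function name and the sample values are hypothetical.

```python
import numpy as np


def performance_measures(obs, pred):
    """Return R2, RMSE, IA and MFB for paired observations and predictions."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    r2 = np.corrcoef(obs, pred)[0, 1] ** 2            # squared Pearson correlation
    rmse = np.sqrt(np.mean((pred - obs) ** 2))
    obs_mean = obs.mean()
    # Willmott's index of agreement (assumed definition).
    ia = 1.0 - np.sum((pred - obs) ** 2) / np.sum(
        (np.abs(pred - obs_mean) + np.abs(obs - obs_mean)) ** 2)
    mfb = 2.0 * np.mean((pred - obs) / (pred + obs))  # mean fractional bias
    return {"R2": r2, "RMSE": rmse, "IA": ia, "MFB": mfb}


# Hypothetical daily PM2.5 values (ug/m3).
obs = np.array([20.0, 35.0, 15.0, 40.0, 25.0])
perfect = performance_measures(obs, obs)
biased = performance_measures(obs, obs + 5.0)
```

A constant offset leaves R2 at one while RMSE and MFB register the bias, which is one reason to report several measures side by side.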

Fig. 5.20 Plots of measured O3 concentrations versus forecasted O3 concentrations by the adaptive model at the TG station.

Fig. 5.21 Scatterplots of measured O3 concentrations versus forecasted O3 concentrations by the adaptive models.


Table 5.10 Performance of nonadaptive and adaptive PM2.5 forecasting models at different monitoring stations of Macau during 2019 and 2020.

PM2.5         TG     EN     PO     TC     CD     KH
Nonadaptive
  R2          0.73   0.74   0.75   0.74   0.75   0.78
  RMSE        6.45   6.99   6.92   7.83   7.05   6.17
  IA          0.92   0.92   0.93   0.91   0.92   0.93
  MFB         0.14   0.02   0.05   0.13   0.13   0.12
Adaptive
  R2          0.74   0.74   0.75   0.75   0.76   0.78
  RMSE        6.20   6.84   6.73   7.34   6.59   5.82
  IA          0.92   0.92   0.93   0.93   0.93   0.94
  MFB         0.01   0.01   0.01   0.02   0.02   0.01

Table 5.11 Performance of nonadaptive and adaptive O3 forecasting models at different monitoring stations of Macau during 2019 and 2020.

O3            TG     EN     TC     CD     KH
Nonadaptive
  R2          0.83   0.84   0.80   0.81   0.81
  RMSE        10.23  6.75   7.55   9.19   11.94
  IA          0.95   0.95   0.94   0.94   0.92
  MFB         0.03   0.09   0.15   0.02   0.07
Adaptive
  R2          0.83   0.84   0.79   0.82   0.81
  RMSE        9.95   6.59   7.19   8.80   10.60
  IA          0.95   0.96   0.94   0.95   0.95
  MFB         0.03   0.04   0.04   0.03   0.03

References

Bart, M., Williams, D.E., Ainslie, B., McKendry, I., Salmond, J., Grange, S.K., Alavi-Shoshtari, M., Steyn, D., Henshaw, G.S., 2014. High density ozone monitoring using gas sensitive semi-conductor sensors in the lower Fraser Valley, British Columbia. Environ. Sci. Technol. 48, 3970–3977. https://doi.org/10.1021/es404610t.
Bayram, F., Ahmed, B.S., Kassler, A., 2022. From concept drift to model degradation: an overview on performance-aware drift detectors. Knowl.-Based Syst. 245, 108632. https://doi.org/10.1016/j.knosys.2022.108632.
Beck, J.L., Katafygiotis, L.S., 1998. Updating models and their uncertainties. I: Bayesian statistical framework. J. Eng. Mech. 124, 455–461. https://doi.org/10.1061/(ASCE)0733-9399(1998)124:4(455).
Beck, J.L., Yuen, K.V., 2004. Model selection using response measurements: Bayesian probabilistic approach. J. Eng. Mech. 130, 192–203. https://doi.org/10.1061/(ASCE)0733-9399(2004)130:2(192).

Bellen, A., Zennaro, M., 2003. Numerical Methods for Delay Differential Equations. Oxford University Press. https://doi.org/10.1093/acprof:oso/9780198506546.001.0001.
Borrego, C., Costa, A.M., Ginja, J., Amorim, M., Coutinho, M., Karatzas, K., Sioumis, T., Katsifarakis, N., Konstantinidis, K., De Vito, S., Esposito, E., Smith, P., Andre, N., Gerard, P., Francis, L.A., Castell, N., Schneider, P., Viana, M., Minguillón, M.C., Reimringer, W., Otjes, R.P., von Sicard, O., Pohle, R., Elen, B., Suriano, D., Pfister, V., Prato, M., Dipinto, S., Penza, M., 2016. Assessment of air quality microsensors versus reference methods: the EuNetAir joint exercise. Atmos. Environ. 147, 246–263. https://doi.org/10.1016/j.atmosenv.2016.09.050.
Box, G.E.P., Jenkins, G.M., 1970. Time Series Analysis: Forecasting and Control. Holden-Day.
Brown, R.G., Hwang, Y.C., 1996. Introduction to Random Signals and Applied Kalman Filtering. John Wiley & Sons Ltd.
Cattell, R.B., 2010. The scree test for the number of factors. Multivar. Behav. Res. 1, 245–276. https://doi.org/10.1207/s15327906mbr0102_10.
Chernodub, A.N., 2014. Training neural networks for classification using the extended Kalman filter: a comparative study. Opt. Mem. Neural Netw. 23, 96–103. https://doi.org/10.3103/S1060992X14020088.
Crilley, L.R., Shaw, M., Pound, R., Kramer, L.J., Price, R., Young, S., Lewis, A.C., Pope, F.D., 2018. Evaluation of a low-cost optical particle counter (Alphasense OPC-N2) for ambient air monitoring. Atmos. Meas. Tech. 11, 709–720. https://doi.org/10.5194/amt-11-709-2018.
Feenstra, B., Papapostolou, V., Hasheminassab, S., Zhang, H., Boghossian, B.D., Cocker, D., Polidori, A., 2019. Performance evaluation of twelve low-cost PM2.5 sensors at an ambient air monitoring site. Atmos. Environ. 216, 116946. https://doi.org/10.1016/j.atmosenv.2019.116946.
Gama, J., Žliobaitė, I., Bifet, A., Pechenizkiy, M., Bouchachia, A., 2014. A survey on concept drift adaptation. ACM Comput. Surv. 46, 1–37. https://doi.org/10.1145/2523813.
Gamse, S., Zhou, W.H., Tan, F., Yuen, K.V., Oberguggenberger, M., 2018. Hydrostatic-season-time model updating using Bayesian model class selection. Reliab. Eng. Syst. 169, 40–50. https://doi.org/10.1016/j.ress.2017.07.018.
Gelb, A., 1974. Applied Optimal Estimation. The MIT Press.
Gençay, R., Liu, T., 1997. Nonlinear modelling and prediction with feedforward and recurrent networks. Phys. D 108, 119–134. https://doi.org/10.1016/S0167-2789(97)82009-X.
Gerboles, M., Spinelle, L., Borowiak, A., 2017. Measuring air pollution with low-cost sensors. European Commission, JRC107461. https://publications.jrc.ec.europa.eu/repository/handle/JRC107461. (Accessed 16 November 2021).
Hager, W.W., 1989. Updating the inverse of a matrix. SIAM Rev. 31, 221–239. https://www.jstor.org/stable/2030425.
Hannachi, A., Jolliffe, I.T., Stephenson, D.B., 2010. Empirical orthogonal functions and related techniques in atmospheric science: a review. Int. J. Climatol. 27, 1119–1152. https://doi.org/10.1002/joc.1499.
Harris, D.C., 2010. Charles David Keeling and the story of atmospheric CO2 measurements. Anal. Chem. 82, 7865–7870. https://doi.org/10.1021/ac1001492.
Hoeting, J.A., Madigan, D., Raftery, A.E., Volinsky, C.T., 1999. Bayesian model averaging: a tutorial. Stat. Sci. 14, 382–401. https://doi.org/10.1214/ss/1009212519.
Hoi, K.I., Yuen, K.V., Mok, K.M., 2009. Prediction of daily averaged PM10 concentrations by statistical time-varying model. Atmos. Environ. 43, 2579–2581. https://doi.org/10.1016/j.atmosenv.2009.02.020.
Hoi, K.I., Mok, K.M., Yuen, K.V., Pun, M.H., 2013a. Investigation of fine particulate pollution in a coastal city with a mobile monitoring platform. Glob. NEST J. 15, 178–187. https://doi.org/10.30955/gnj.002538.
Hoi, K.I., Yuen, K.V., Mok, K.M., 2013b. Improvement of the multilayer perceptron for air quality modelling through an adaptive learning scheme. Comput. Geosci. 59, 148–155. https://doi.org/10.1016/j.cageo.2013.06.002.
Hoi, K.I., Yuen, K.V., Mok, K.M., Miranda, A.I., Ribeiro, I., 2016. Comparison of the Offline and the Online bias Correction of the WRF-EURAD in Porto, Portugal. In: International Congress on Environmental Modelling and Software, p. 56.

Hu, D., Qiao, L., Chen, J., Ye, X., Yang, X., Cheng, T., Fang, W., 2010. Hygroscopicity of inorganic aerosols: size and relative humidity effects on the growth factor. Aerosol Air Qual. Res. 10, 255–264. https://doi.org/10.4209/aaqr.2009.12.0076.
Jeffreys, H., 1961. Theory of Probability, third ed. Oxford Clarendon Press.
Jenkins, C.R., Peacock, J.A., 2011. The power of Bayesian evidence in astronomy. Mon. Not. R. Astron. Soc. 413, 2895–2905. https://doi.org/10.1111/j.1365-2966.2011.18361.x.
Kalman, R.E., 1960. A new approach to linear filtering and prediction problems. Trans. ASME J. Basic Eng. 82, 35–45. https://doi.org/10.1115/1.3662552.
Kalman, R.E., Bucy, R.S., 1961. New results in linear filtering and prediction theory. Trans. ASME J. Basic Eng. 83, 95–107. https://doi.org/10.1115/1.3658902.
Keeling, C.D., Piper, S.C., Bacastow, R.B., Wahlen, M., Whorf, T.P., Heimann, M., Meijer, H.A., 2005. Atmospheric CO2 and 13CO2 Exchange with the Terrestrial Biosphere and Oceans from 1978 to 2000: Observations and Carbon Cycle Implications. University of Groningen, Centre for Isotope Research.
Kong, Q., Siauw, T., Bayen, A., 2020. Python Programming and Numerical Methods, a Guide for Engineers and Scientists. Elsevier Science, p. 480. https://doi.org/10.1016/C2018-0-04165-1.
Kovac, J., Weisberg, M., 2012. Roald Hoffmann on the Philosophy, Art, and Science of Chemistry. Oxford University Press.
Lagarias, J.C., Reeds, J.A., Wright, M.H., Wright, P.E., 1998. Convergence properties of the Nelder-Mead simplex method in low dimensions. SIAM J. Optim. 9, 112–147. https://doi.org/10.1137/S1052623496303470.
Lai, X., Yang, T., Wang, Z., Chen, P., 2019. IoT implementation of Kalman filter to improve accuracy of air quality monitoring and prediction. Appl. Sci. 9, 1831. https://doi.org/10.3390/app9091831.
Lary, D.J., Mussa, H.Y., 2004. Using an extended Kalman filter algorithm for feed-forward neural networks to describe tracer concentrations. Atmos. Chem. Phys. Discuss. 4, 3653–3667. https://doi.org/10.5194/acpd-4-3653-2004.
Lazzús, J.A., Salfate, I., 2017. Long-term prediction of wind speed in La Serena City (Chile) using hybrid neural network. Earth Sci. Res. J. 21, 29–35. https://doi.org/10.15446/esrj.v21n1.50337.
Lee, S., Park, S.H., 2022. Concept drift modeling for robust autonomous vehicle control systems in time-varying traffic environments. Expert Syst. Appl. 190, 116206. https://doi.org/10.1016/j.eswa.2021.116206.
Li, Z., Fung, C.H., Lau, K.H., 2018. High spatiotemporal characterization of on-road PM2.5 concentrations in high-density urban areas using mobile monitoring. Build. Environ. 143, 196–205. https://doi.org/10.1016/j.buildenv.2018.07.014.
Li, W., Teng, X., Chen, X., Liu, L., Xu, L., Zhang, J., Wang, Y., Zhang, Y., Shi, Z., 2021. Organic coating reduces hygroscopic growth of phase-separated aerosol particles. Environ. Sci. Technol. 55, 16339–16346. https://doi.org/10.1021/acs.est.1c05901.
Liang, L., 2021. Calibrating low-cost sensors for ambient air monitoring: techniques, trends, and challenges. Environ. Res. 197, 111163. https://doi.org/10.1016/j.envres.2021.111163.
Lindfield, G., Penny, J., 2019. Numerical Methods Using Matlab. Academic Press. https://doi.org/10.1016/C2016-0-00395-9.
Liu, F., Wang, M., Zhang, M., 2021. Effects of COVID-19 lockdown on global air quality and health. Sci. Total Environ. 755, 142533. https://doi.org/10.1016/j.scitotenv.2020.142533.
López-Caraballo, C.H., Salfate, I., Lazzús, J.A., Rojas, P., Rivera, M., Palma-Chilla, L., 2016. Mackey-Glass noisy chaotic time series prediction by a swarm-optimized neural network. J. Phys. Conf. Ser. 720, 012002. https://doi.org/10.1088/1742-6596/720/1/012002.
Lorenz, E.N., 1956. Empirical Orthogonal Functions and Statistical Weather Prediction. Scientific Report of Statistical Forecasting Project. MIT, Cambridge, Massachusetts, p. 49. https://eapsweb.mit.edu/sites/default/files/Empirical_Orthogonal_Functions_1956.pdf. (Accessed 10 May 2022).
Lüthi, D., Le Floch, M., Bereiter, B., Blunier, T., Barnola, J.M., Siegenthaler, U., Raynaud, D., Jouzel, J., Fischer, H., Kawamura, K., Stocker, T.F., 2008. High-resolution carbon dioxide concentration record 650,000–800,000 years before present. Nature 453, 379–382. https://doi.org/10.1038/nature06949.

Advanced Bayesian air quality forecasting methods

Mackay, J.C., 1992. Bayesian interpolation. Neural Comput. 4, 415–447. https://doi.org/10.1162/ neco.1992.4.3.415. Mackey, M.C., Glass, L., 1977. Oscillation and chaos in physiological control systems. Science 197, 287–289. https://doi.org/10.1126/science.267326. Mok, K.M., Miranda, A.I., Yuen, K.V., Hoi, K.I., Monteiro, A., Ribeiro, I., 2017. Selection of bias correction models for improving the daily PM10 forecasts of WRF-EURAD in Porto, Portugal. Atmos. Pollut. Res. 8, 628–639. https://doi.org/10.1016/j.apr.2016.12.010. Mok, K.M., Yuen, K.V., Hoi, K.I., Chao, K.M., 2018. Predicting ground-level concentrations by adaptive Bayesian model averaging of statistical seasonal models. Stoch. Environ. Res. Risk Assess. 32, 1283–1297. https://doi.org/10.1007/s00477-017-1473-1. Moreno-Torres, J.G., Raeder, T., Alaiz-Rodriguez, R., Chawla, N.V., Herrera, F., 2012. A unifying view on dataset shift in classification. Pattern Recogn. 45, 521–530. https://doi.org/10.1016/j. patcog.2011.06.019. Nelder, J.A., Mead, R., 1965. A simplex method for function minimization. Comp. J. 7, 308–313. https:// doi.org/10.1093/comjnl/7.4.308. North, G.R., Bell, T.L., Cahalan, R.F., Moeng, F.J., 1982. Sampling errors in the estimation of empirical orthogonal functions. Mon. Weather Rev. 110, 699–706. https://doi.org/10.1175/1520-0493(1982) 1102.0.CO;2. Prakash, G., Balomenos, G.P., 2021. A Bayesian approach to model selection and averaging of hydrostaticseason-temperature-time model. Structure 33, 4359–4370. https://doi.org/10.1016/j. istruc.2021.06.109. Putaud, J.P., Pozzoli, L., Pisoni, E., Santos, S.M.D., Lagler, F., Lanzani, G., Santo, U.D., Colette, A., 2021. Impacts of the COVID-19 lockdown on air pollution at regional and urban background sites in northern Italy. Atmos. Chem. Phys. 21, 7597–7609. https://doi.org/10.5194/acp-21-7597-2021. Qiao, J., Li, S., Han, H., Wang, D., 2017. An improved algorithm for building self-organizing feedward neural networks. Neurocomputing 262, 28–40. 
https://doi.org/10.1016/j.neucom.2016.12.092. Raftery, A.E., Ka´rny´, M., Ettler, P., 2010. Online prediction under model uncertainty via dynamic model averaging. Application to a cold rolling mill. Technometrics 52, 52–66. https://doi.org/10.1198/ TECH.2009.08104. Ridder, K.D., Kumar, U., Lauwaet, D., Blyth, L., Lefebvre, W., 2012. Kalman filter-based air quality forecast adjustment. Atmos. Environ. 50, 381–384. https://doi.org/10.1016/j.atmosenv.2012.01.032. Romeo, G., 2020. Elements of Numerical Mathematical Economics with Excel. Academic Press, https:// doi.org/10.1016/C2018-0-02476-7. Roussel, M.R., 2018. The Mackey-Glass models, 40 years later. Biomath. Commun. 5, 140–158. https:// doi.org/10.11145/bmc.2018.10.277. Samadi, S., Pourreza-Bilondi, M., Wilson, C.A.M.E., Hitchcock, D.B., (2020) Bayesian model averaging with fixed and flexible priors: theory, concepts, and calibration experiments for rainfall-runoff modelling. J. Adv. Model. Earth Syst. 12, 28pp. doi:https://doi.org/10.1029/2019MS001924. Sa´nchez-Balseca, J., Perez-Forguet, A., 2020. Modelling hourly spatio-temporal PM2.5 concentration in wildfire scenarios using dynamic linear models. Atmos. Res. 242, 104999. https://doi.org/10.1016/j. atmosres.2020.104999. Singhal, S., Wu, L., 1989. Training feed-forward networks with the extended Kalman algorithm. In: Proceedings of the International Conference on Acoustics, Speech and Signal Processing ICASSP-89. Glasgow, Scotland, pp. 1187–1190, https://doi.org/10.1021/es404610t. Sivia, D.S., 1996. Data Analysis: A Bayesian Tutorial. Oxford Science Publications. Sum, J., Leung, C., Young, G.H., Kan, W., 1999. On the Kalman filtering method in neural network training and pruning. IEEE Trans. Neural Netw. 10, 161–166. https://doi.org/10.1109/72.737502. Taghizadeh-Mehrjardi, R., Khademi, H., Khayamim, F., Zeraatpisheh, M., Heung, B., Scholten, T., 2022. A comparison of model averaging techniques to predict the spatial distribution of soil properties. Remote Sens. 
(Basel) 14. https://doi.org/10.3390/rs14030472. 16 pp. United Nations framework convention on climate change, UNFCCC, 1998. Kyoto Protocol to the United Nations Framework Convention on Climate Change. http://unfccc.int/resource/docs/convkp/kpeng. pdf. (Accessed 1 July 2021).

309

310

Air quality monitoring and advanced bayesian modeling

United Nations framework convention on climate change, UNFCCC, 2012. Doha Amendment to Kyoto’s Protocol. https://treaties.un.org/doc/Publication/CN/2012/CN.718.2012-Eng.pdf. (Accessed 1 July 2021). United Nations framework convention on climate change, UNFCCC, 2015. Paris Agreement. https:// unfccc.int/sites/default/files/english_paris_agreement.pdf. (Accessed 1 July 2021). Wang, J., Tang, L., Luo, Y., Ge, P., 2017. A weighted EMD-based prediction model based on TOPSIS and feed forward neural network for noised time series. Knowl. Based Syst. 132, 167–178. https://doi.org/ 10.1016/j.knosys.2017.06.022. Yang, Y., Zhao, J., Song, J., Wu, J., Zhao, C., Leng, H., 2022. A hybrid method using HAVOK analysis and machine learning for predicting chaotic time series. Entropy 24, 408. https://doi.org/10.3390/ e24030408. Yatkin, S., Gerboles, M., Belis, C.A., Karagulian, F., Lagler, F., Barbiere, M., Borowiak, A., 2020. Representativeness of an air quality monitoring station for PM2.5 and source apportionment over a small urban domain. Atmos. Pollut. Res. 11, 225–233. https://doi.org/10.1016/j.apr.2019.10.004. Yu, H., Bang, S., 1997. An improved time series prediction by applying the layer-by-layer learning method to FIR neural networks. Neural Netw. 10, 1717–1729. https://doi.org/10.1016/S0893-6080(97) 00066-X. Yuen, K.V., 2010. Bayesian Methods for Structural Dynamics and Civil Engineering. John Wiley & Sons. Yuen, K.V., Katafygiotis, L.S., 2001. Bayesian time-domain approach for model updating using ambient data. Probab. Eng. Mech. 16, 219–231. https://doi.org/10.1016/S0266-8920(01)00004-2. Yuen, K.V., Kuok, S.C., 2010. Modeling of environmental influence in structural health assessment for reinforced concrete buildings. Earthq. Eng. Eng. Vib. 9, 295–306. https://doi.org/10.1007/s11803-0100014-4. Yuen, K.V., Hoi, K.I., Mok, K.M., 2007. Selection of noise parameters for Kalman filter. Earthq. Eng. Eng. Vib. 6, 48–56. https://doi.org/10.1007/s11803-007-0659-9. 
Zhao, J., Yu, X., 2015. Adaptive natural gradient learning algorithms for Mackey-Glass chaotic time prediction. Neurocomputing 157, 41–45. https://doi.org/10.1016/j.neucom.2015.01.039.

Index

Note: Page numbers followed by f indicate figures and t indicate tables.

A Adaptive linear models, 298–305, 299–300t, 302–305f, 302t Advanced Bayesian air quality forecasting methods adaptive Bayesian model averaging, 287–291, 291–292t, 293–296f, 295t air quality forecasting in Macau, 298–305, 299–300t, 302–305f, 302t Bayesian model class selection of, 250–260 Bayes’ theorem, 253 case studies, 298–305 Kalman filter-based adaptive air quality model, 261–273 model adaptiveness, 248–250 model complexity, 246–248 multiple time-varying regression models, adaptive Bayesian model averaging of, 286–287 time-varying multilayer perceptron, 273–285 Aerosol chemical speciation monitor (ACSM), 48–49 Aerosol mass spectrometer (AMS), 47 Aerosol time-of-flight mass spectrometer (ATOFMS), 50–51 Air-borne air quality monitoring, 140–159 balloon-borne measurements, 141–144, 142t manned-aircraft measurements, 144–152, 145f, 147–148t other mobile measurement platforms, 159 unmanned-aircraft measurements, 152–159, 154–155f, 157f Air pollutant forecasting of, 8–10 modeling, 8–10 particulate matters, 37–45, 38–39f polar stratospheric clouds (PSC), 4 sources and impacts of, 3–6 VOCs, 6 Air quality forecasting in Macau, 298–305, 299–300t, 302–305f, 302t Air quality index (AQI), 1 Air quality monitoring methods, 6–8 criteria air pollutant methods carbon monoxide (CO), 14–19, 16f Fourier transform infrared radiometry (FTIR), 18–19 gas filter correlation (GFC), 17–18, 17f nitrogen oxides (NOx), 24–30 optical parametric oscillator (OPO), 24 ozone (O3), 30–37 particulate matters, 37–45, 38–39f sulfur dioxide, 19–24 ultraviolet photometry (UVP) method, 32–33 low-cost sensors (LCS), 105–135 analytical merits, 126–129, 128t data correction, 132–135 electrochemical sensors, 107–112, 109f, 111t field comparisons, 131–132 lab calibrations and, 131–132 light emitting diode (LED), 120–121 low-pulse occupancy (LPO) method, 120–121 metal oxide sensors, 112–117, 114f new considerations, 126–135 PM, optical sensors for, 117–123, 119f potential interferences, 129–130 VOCs, sensors for, 123–125, 124f mobile measurement platforms, 135–159 air-borne air quality monitoring, 140–159 on-road air quality monitoring, 135–140 real-time chemical composition monitoring, 45–81 particulate matters (PM), 46–61 real-time measurements of gases, optical techniques for, 69–74, 72f thermal and optical techniques for, 74–81 volatile organic compounds (VOCs), 61–69 Air quality stations (AQS), 29 Ammonia, 2–3 Amperometric gas sensor (AGS), 108 Analytical merits, 126–129, 128t Arctic Research of the Composition of the Troposphere from Aircraft and Satellites (ARCTAS), 146–149 Atmospheric Boundary Layer Experiments (ABLE), 145–146 Atmospheric Radiation Measurement (ARM) program, 145–146


B Balloon-borne measurements, 141–144, 142t Bayes’ theorem, 253 Beer-Lambert law, 72 Black carbon (BC), 74–75

C Carbon monoxide (CO), 2–3, 14–19, 16f Catalytic conversion, 28 Cavity attenuated phase shift (CAPS), 30 Cavity ring-down spectroscopy (CRDS), 18–19, 69–70 Centered root-mean-square error (CRMSE), 127 Chemical ionization mass spectrometry (CIMS), 150 Chemiluminescence, 26, 27f Chlorinated and fluorinated carbons (CFCs), 4 Civil Aircraft for the Regular Investigation of the Atmosphere Based on an Instrument Container (CARIBIC), 149–150 Classification and regression tree (CART) bagging and random forests, 194–195 classification tree, 192–194 regression tree, 187–192, 189f Clean air ammonia, 2–3 carbon monoxide (CO), 2–3 nitrogen oxides, 2–3 polluted air vs., 1–3 volatile organic compounds (VOCs), 2–3 CO2 emissions, 195–198 Computational fluid dynamic (CFD), 156–158 Condensation particle counter (CPC), 117–118, 150–151 Counter electrode (CE), 108–110 Criteria air pollutant methods carbon monoxide (CO), 14–19, 16f Fourier transform infrared radiometry (FTIR), 18–19 gas filter correlation (GFC), 17–18, 17f nitrogen oxides (NOx), 24–30 optical parametric oscillator (OPO), 24 ozone (O3), 30–37 particulate matters, 37–45, 38–39f sulfur dioxide, 19–24 ultraviolet photometry (UVP) method, 32–33 Cyclones, 40

D Data correction, 132–135 Decision trees (DT), 133 Differential absorption LIDAR (DIAL) systems, 36 Differential mobility analyzer (DMA), 150–151 Differential optical absorption spectroscopic (DOAS), 36 Diffusion size classifiers (DiSCs), 117–118 Discrete grab sampling, 15

E Electrochemical (EC) sensors, 107–112, 109f, 111t Electrode (WE), 108–110 Electron capture detection (ECD), 14–16 Energy dispersive X-ray fluorescence (EDXRF), 56–57 Environmental Technology Verification (ETV) program, 57 Extractive electrospray ionization coupled with time-of-flight mass spectrometry (EESI-MS), 150–151

F Fast mobility particle sizer (FMPS), 117–118, 218–219 Federal equivalent method (FEM), 13–15, 31 Federal reference method (FRM), 13–14, 31 Flame ionization detection (FID), 14 Flame photometric detection (FPD), 19–20 Forced harmonic oscillator (FHO) model, 42 Fourier transform infrared radiometry (FTIR), 14, 18–19

G Gaseous elemental mercury (GEM), 34 Gas filter correlation (GFC), 17–18, 17f Gas-phase chemiluminescence method, 31 Geiger-Muller detectors, 44–45 Global Tropospheric Experiment (GTE), 145–146 Greedy algorithm, 190 Ground-level O3 concentrations of Macau, 221–238, 222–224t, 222f, 226t, 227–228f, 229–231t, 232–233f, 234–235t, 236–237f

H Helium (He), 141 Hessian matrix, 252–253 High-altitude and long-range (HALO), 151–152 Hong Kong Environmental Protection Department (HKEPD), 206–209 Hygroscopic tandem DMA (HTDMA), 150–151

I Indium gallium arsenide (InGaAs), 18 Influence of Pollution on Aerosols and Cloud Microphysics in North China (IPAC-NC) project, 146–149 Initial covariance matrix, 270–271 Integrated cavity output spectroscopy (ICOS), 18–19 Intercontinental Chemical Transport Experiment (INTEX), 145–146 International Consortium for Atmospheric Research on Transport and Transformation (ICARTT), 146–149 Intracavity laser absorption spectroscopy (ICLAS), 69–70 Ion mobility spectrometer (IMS), 123–124

K Kalman filter-based adaptive air quality model, 261–273 Keeling curve, 256–260, 257f, 258t, 260t, 261f K-fold cross-validation, 191–192

L Laser-induced fluorescence (LIF), 20, 24 LASSO regression, 179–184, 184f Lead selenide (PbSe), 18 Light detection and ranging (LiDAR), 140–141 Light emitting diode (LED), 120–121 Likelihood, 174–176, 248–249, 251–252, 254–257, 268–269, 288 Limiting-current theory, 110–111 Linear regression model, 250–260 Lithium polymer (LiPo), 153–156 Long-path absorption photometry (LOPAP), 36–37 Low-cost sensors (LCS), 105–135 analytical merits, 126–129, 128t data correction, 132–135 electrochemical sensors, 107–112, 109f, 111t field comparisons, 131–132 lab calibrations and, 131–132 light emitting diode (LED), 120–121 low-pulse occupancy (LPO) method, 120–121 metal oxide sensors, 112–117, 114f new considerations, 126–135 PM, optical sensors for, 117–123, 119f potential interferences, 129–130 VOCs, sensors for, 123–125, 124f Low-pulse occupancy (LPO) method, 120–121

M Macau Meteorological and Geophysical Bureau (SMG), 221–223 Machine-learning-based approaches, 133 Mackey–Glass time series, 283–285, 284–285f Manned-aircraft measurements, 144–152, 145f, 147–148t Mauna Loa Observatory of Hawaii, 270 Mean absolute error (MAE), 127 Mean absolute percentage error (MAPE), 127 Mean bias error (MBE), 127 Mean relative error (MRE), 127 Mean square error (MSE), 127 Metal oxide semiconducting, 112 Metal oxide sensors, 112–117, 114f Methyl sulfonic acid (MSA), 159 Micro-orifice uniform deposit impactor (MOUDI), 42–43, 56 Mie scattering (MS), 70 Mobile measurement platforms air-borne air quality monitoring, 140–159 balloon-borne measurements, 141–144, 142t manned-aircraft measurements, 144–152, 145f, 147–148t other mobile measurement platforms, 159 unmanned-aircraft measurements, 152–159, 154–155f, 157f on-road air quality monitoring, 135–140 monitoring method and data analysis, requirements on, 140 nonpowered and nonfixed-route platforms, 139–140 powered and fixed-route vehicles, 138–139, 139f powered and nonfixed-route vehicles, 136–138, 137f Model adaptiveness, 248–250 Model complexity, 246–248 MODIS AOD products, 179 Monitoring method and data analysis, requirements on, 140 Multilayer perceptron, 199–209 Multiple linear regression (MLR) LASSO regression, 179–186, 184f ridge regression, 179–186 Multiple time-varying regression models, adaptive Bayesian model averaging of, 286–287

N Nano air vehicle (NAV), 153 National Aeronautics and Space Administration (NASA), 145–146 National Center for Atmospheric Research (NCAR), 144 Neph-type Shinyei PPD42NS sensor, 121–122 Neutron activation analysis (INAA), 56–57 New particle formation (NPF), 144 Nitric oxide (NO), 22–24 Nitrogen oxides, 2–4 Nondispersive infrared (NDIR), 75 Nonmethane Hydrocarbon Intercomparison Experiment (NOMHICE), 61–62 Nonmethane hydrocarbons (NMHCs), 6 Nonpowered and nonfixed-route platforms, 139–140 Normalized mean-square error (NMSE), 127

O Ockham factor, 255 Onboard manned aircrafts, 150 On-road air quality monitoring, 135–140 monitoring method and data analysis, requirements on, 140 nonpowered and nonfixed-route platforms, 139–140 powered and fixed-route vehicles, 138–139, 139f powered and nonfixed-route vehicles, 136–138, 137f Optical parametric oscillator (OPO), 24 Organic aerosols (OA), 52 Ozone (O3), 30–37

P Pacific Exploratory Missions (PEM), 145–146 Particle-into-liquid sampler (PILS), 150–151 Particle number concentration (PNC), 140 Particle size distribution (PSD), 140 Particulate matters (PM), 2–3, 5, 10–11, 37–61, 38–39f aerosol chemical speciation monitor (ACSM), 48–49 ion chromatographic systems gases and particles, 55–56 particles only, 53–55 mass spectrometry based on electron impact (EI), 47–50 mass spectrometry based on laser ionization desorption (LDI), 50–52, 51t optical sensors for, 117–123, 119f real-time PM measurement ion chromatography for, 53–56 mass spectrometry for, 46–52 trace elements, real-time measurement of, 56–61, 59–60t Pellistor-based sensors, 125 Photochemical Assessment Monitoring Station (PAMS), 62–63 Photoionization detection (PID), 123–124 PM2.5, 221–238, 222–224t, 222f, 226t, 227–228f, 229–231t, 232–233f, 234–235t, 236–237f Polar stratospheric clouds (PSC), 4 Polyvinylidene fluoride (PVDF), 35 Proportional integral derivative (PID) control, 55 Positive matrix factorization (PMF), 58 Potential interferences, 129–130 Powered and fixed-route vehicles, 138–139, 139f Powered and nonfixed-route vehicles, 136–138, 137f Probability density function (PDF), 174–175 Proton-transfer-reaction mass spectrometry (PTR-MS), 64

Q Quadrupole (QMS), 65–66 Quadrupole aerosol mass spectrometer (Q-AMS), 47 Quantum cascade laser (QCL), 18–19 Quartz crystal microbalance (QCM), 42–43

R Random forests (RF), 133 Rayleigh regime, 118 Rayleigh scattering (RS), 70 Real-time chemical composition monitoring, 45–81 particulate matters (PM), 46–61 real-time measurements of gases, optical techniques for, 69–74, 72f thermal and optical techniques for, 74–81 volatile organic compounds (VOCs), 61–69 Reducing compound photometer (RCP), 14–15 Reference electrode (RE), 108–110 Regression tree, 187–192, 189f Residual gas analyzer (RGA), 48–49 Residual sum of squares (RSS), 191 Ridge regression, 179–184 Root-mean-square error (RMSE), 127, 132, 248f

S Scanning mobility particle sizer (SMPS), 117–118, 150–151, 218–219 Schottky barrier behavior, 114 Secondary inorganic aerosol (SIA), 3–4 Secondary organic aerosol (SOA), 45–46 Sherman–Morrison formula, 265 Silicon drift detector (SDD), 57 Silver oxide (AgO), 15 Simple harmonic oscillator (SHO), 42 SPAMS, 52 Spectroscopic method, 25–26 Sulfur dioxide, 19–24 Support vector regression (SVR), 133, 209–220

T Tapered element oscillating microbalance (TEOM), 41–42, 45 Thermal conductivity detector (TCD), 123–124 Time resolution, 150 Time-varying multilayer perceptron (TVMLP), 273–285 Total optical reflectance (TOR), 75 Total organic carbon (TOC), 150–151 Total reflection X-ray fluorescence (TRXRF), 56–57 Traditional statistical air quality forecasting methods classification and regression tree (CART), 186–198 multilayer perceptron, 199–209 multiple linear regression (MLR), 173–186 support vector regression (SVR), 209–220 Traditional statistical models, 9–10 Transport and Atmospheric Chemistry in the Atlantic (TRACE), 145–146 Tunable diode laser absorption spectroscopy (TDLAS), 30 Tunable diode laser spectrometry (TDLS), 18–19

U Ultraviolet photometry (UVP) method, 32–33 Uninterrupted power supply (UPS), 136 Unmanned aerial vehicles (UAVs), 152–153 Unmanned-aircraft measurements, 152–159, 154–155f, 157f URG ambient ion monitor (URG-AIM), 53

V Vacuum ultraviolet resonance fluorescence (VUV-RF), 18–19 Volatile organic compounds (VOCs), 2–3, 10–11, 61–69 real-time measurement gas chromatography, 62–63 mass spectrometry, 63–69, 65f sensors for, 123–125, 124f Volcano eruption, 5

W Wet annular denuders (WAD), 53, 55 Williams, David E., 116 Woods Hole Oceanographic Institution (WHOI), 159

Y Yttria-stabilized zirconia (YSZ), 108
