Advances in Streamflow Forecasting: From Traditional to Modern Approaches [1 ed.] 012820673X, 9780128206737

Advances in Streamflow Forecasting: From Traditional to Modern Approaches covers the three major data-driven approaches


English Pages 404 [386] Year 2021


Table of contents:
Front matter: Advances in Streamflow Forecasting
Copyright
Dedication
Contributors
About the editors
Foreword
Preface
Acknowledgment
1. Streamflow forecasting: overview of advances in data-driven techniques
1.1 Introduction
1.2 Measurement of streamflow and its forecasting
1.3 Classification of techniques/models used for streamflow forecasting
1.4 Growth of data-driven methods and their applications in streamflow forecasting
1.4.1 Time series modeling
1.4.2 Artificial neural network
1.4.3 Other AI techniques
1.4.4 Hybrid data-driven techniques
1.5 Comparison of different data-driven techniques
1.6 Current trends in streamflow forecasting
1.7 Key challenges in forecasting of streamflows
1.8 Concluding remarks
References
2. Streamflow forecasting at large time scales using statistical models
2.1 Introduction
2.2 Overview of statistical models used in forecasting
2.2.1 Forecasting in general
2.2.1.1 ARIMA models
2.2.1.2 Exponential smoothing models
2.2.1.3 General literature
2.2.1.4 Literature in hydrology
2.3 Theory
2.3.1 ARIMA models
2.3.1.1 Definition
2.3.1.2 Forecasting with ARIMA models
2.3.2 Exponential smoothing models
2.4 Large-scale applications at two time scales
2.4.1 Application 1: multi-step ahead forecasting of 270 time series of annual streamflow
2.4.2 Application 2: multi-step ahead forecasting of 270 time series of monthly streamflow
2.5 Conclusions
Conflicts of interest
Acknowledgment
References
3. Introduction of multiple/multivariate linear and nonlinear time series models in forecasting streamflow process
3.1 Introduction
3.1.1 Review of MLN time series models
3.2 Methodology
3.2.1 VAR/VARX model
3.2.2 Model building procedure
3.2.3 MGARCH model
3.2.3.1 Diagonal VECH model
3.2.3.2 Testing conditional heteroscedasticity
3.2.4 Case study
3.3 Application of VAR/VARX approach
3.3.1 The VAR model
3.3.2 The VARX model
3.4 Application of MGARCH approach
3.5 Comparative evaluation of models’ performances
3.6 Conclusions
References
4. Concepts, procedures, and applications of artificial neural network models in streamflow forecasting
4.1 Introduction
4.2 Procedure for development of artificial neural network models
4.2.1 Structure of artificial neural network models
4.2.1.1 Neurons and connection formula
4.2.1.2 Transfer function
4.2.1.3 Architecture of neurons
4.2.2 Network training processes
4.2.2.1 Unsupervised training method
4.2.2.2 Supervised training method
4.2.3 Artificial neural network to approximate a function
4.2.3.1 Step 1: preprocessing of data
4.2.3.1.1 Data normalization techniques
4.2.3.1.2 Principal component analysis
4.2.3.2 Step 2: choosing the best network architecture
4.2.3.3 Step 3: postprocessing of data
4.3 Types of artificial neural networks
4.3.1 Multilayer perceptron neural network
4.3.2 Static and dynamic neural network
4.3.3 Statistical neural networks
4.4 An overview of application of artificial neural network modeling in streamflow forecasting
References
5. Application of different artificial neural network for streamflow forecasting
5.1 Introduction
5.2 Development of neural network technique
5.2.1 Multilayer perceptron
5.2.2 Recurrent neural network
5.2.3 Long short-term memory network
5.2.4 Gated recurrent unit
5.2.5 Convolutional neural network
5.2.6 WaveNet
5.3 Artificial neural network in streamflow forecasting
5.4 Application of ANN: a case study of the Ganges River
5.5 ANN application software and programming language
5.6 Conclusions
5.7 Supplementary information
References
6. Application of artificial neural network and adaptive neuro-fuzzy inference system in streamflow forecasting
6.1 Introduction
6.2 Theoretical description of models
6.2.1 Artificial neural network
6.2.2 Adaptive neuro-fuzzy inference system
6.3 Application of ANN and ANFIS for prediction of peak discharge and runoff: a case study
6.3.1 Study area description
6.3.2 Methodology
6.3.2.1 Principal component analysis
6.3.2.2 Artificial neural network
6.3.2.3 Adaptive neuro-fuzzy inference system
6.3.2.4 Assessment of model performance by statistical indices
6.3.2.5 Sensitivity analysis
6.4 Results and discussion
6.4.1 Results of ANN modeling
6.4.2 Results of ANFIS modeling
6.5 Conclusions
References
7. Genetic programming for streamflow forecasting: a concise review of univariate models with a case study
7.1 Introduction
7.2 Overview of genetic programming and its variants
7.2.1 Classical genetic programming
7.2.2 Multigene genetic programming
7.2.3 Linear genetic programming
7.2.4 Gene expression programming
7.3 A brief review of the recent studies
7.4 A case study
7.4.1 Study area and data
7.4.2 Criteria for evaluating performance of models
7.5 Results and discussion
7.6 Conclusions
References
8. Model tree technique for streamflow forecasting: a case study in sub-catchment of Tapi River Basin, India
8.1 Introduction
8.2 Model tree
8.3 Model tree applications in streamflow forecasting
8.4 Application of model tree in streamflow forecasting: a case study
8.4.1 Study area
8.4.2 Methodology
8.5 Results and analysis
8.5.1 Selection of input variables
8.5.2 Model configuration
8.5.3 Model calibration and validation
8.5.4 Sensitivity analysis of model configurations towards model performance
8.5.4.1 Influence of input variable combinations
8.5.4.2 Influence of model tree variants
8.5.4.3 Influence of data proportioning
8.5.5 Selection of best-fit model for streamflow forecasting
8.6 Summary and conclusions
Acknowledgments
References
9. Averaging multiclimate model prediction of streamflow in the machine learning paradigm
9.1 Introduction
9.2 Salient review on ANN and SVR modeling for streamflow forecasting
9.3 Averaging streamflow predicted from multiclimate models in the neural network framework
9.4 Averaging streamflow predicted by multiclimate models in the framework of support vector regression
9.5 Machine learning–averaged streamflow from multiple climate models: two case studies
9.6 Conclusions
References
10. Short-term flood forecasting using artificial neural networks, extreme learning machines, and M5 model tree
10.1 Introduction
10.2 Theoretical background
10.2.1 Artificial neural networks
10.2.2 Extreme learning machines
10.2.3 M5 model tree
10.3 Application of ANN, ELM, and M5 model tree techniques in hourly flood forecasting: a case study
10.3.1 Study area and data
10.3.2 Methodology
10.4 Results and discussion
10.5 Conclusions
References
11. A new heuristic model for monthly streamflow forecasting: outlier-robust extreme learning machine
11.1 Introduction
11.2 Overview of extreme learning machine and multiple linear regression
11.2.1 Extreme learning machine model and its extensions
11.2.2 Multiple linear regression
11.3 A case study of forecasting streamflows using extreme machine learning models
11.3.1 Study area
11.4 Applications and results
11.5 Conclusions
References
12. Hybrid artificial intelligence models for predicting daily runoff
12.1 Introduction
12.2 Theoretical background of MLP and SVR models
12.2.1 Support vector regression model
12.2.2 Multilayer perceptron neural network model
12.2.3 Grey wolf optimizer algorithm
12.2.4 Whale optimization algorithm
12.2.5 Hybrid MLP neural network model
12.2.6 Hybrid SVR model
12.3 Application of hybrid MLP and SVR models in runoff prediction: a case study
12.3.1 Study area and data acquisition
12.3.2 Gamma test for evaluating the sensitivity of input variables
12.3.3 Multiple linear regression
12.3.4 Performance evaluation indicators
12.4 Results and discussion
12.4.1 Identification of appropriate input variables using gamma test
12.4.2 Predicting daily runoff using hybrid AI models
12.5 Conclusions
References
13. Flood forecasting and error simulation using copula entropy method
13.1 Introduction
13.2 Background
13.2.1 Artificial neural networks
13.2.2 Entropy theory
13.2.3 Copula function
13.3 Determination of ANN model inputs based on copula entropy
13.3.1 Methodology
13.3.1.1 Copula entropy theory
13.3.1.2 Partial mutual information
13.3.1.3 Input selection based on copula entropy method
13.3.2 Application of copula entropy theory in flood forecasting—a case study
13.3.2.1 Study area and data description
13.3.2.2 Flood forecasts at Three Gorges Reservoir
13.3.2.3 Flood forecasting at the outlet of Jinsha River
13.3.2.4 Performance evaluation
13.3.2.5 Results of selected model inputs
13.4 Flood forecast uncertainties
13.4.1 Distributions for fitting flood forecasting errors
13.4.2 Determination of the distributions of flood forecasting uncertainties at TGR
13.5 Flood forecast uncertainty simulation
13.5.1 Flood forecasting uncertainties simulation based on copulas
13.5.2 Flood forecasting uncertainties simulation
13.6 Conclusions
References
1 - Books and book chapters on data-driven approaches
2 - List of peer-reviewed journals on data-driven approaches
3 - Data and software
Web resources for open data sources of streamflow
Software packages for streamflow modeling and forecasting
Index
A
B
C
D
E
F
G
H
K
L
M
N
O
P
R
S
T
U
V
W
Z

Advances in Streamflow Forecasting From Traditional to Modern Approaches

Edited by Priyanka Sharma and Deepesh Machiwal

Elsevier
Radarweg 29, PO Box 211, 1000 AE Amsterdam, Netherlands
The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, United Kingdom
50 Hampshire Street, 5th Floor, Cambridge, MA 02139, United States

Copyright © 2021 Elsevier Inc. All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher's permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions. This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).

Notices
Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary. Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility. To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.

Library of Congress Cataloging-in-Publication Data
A catalog record for this book is available from the Library of Congress

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library

ISBN: 978-0-12-820673-7

For information on all Elsevier publications visit our website at https://www.elsevier.com/books-and-journals

Publisher: Candice Janco
Acquisitions Editor: Louisa Munro
Editorial Project Manager: Aleksandra Packowska
Production Project Manager: Kiruthika Govindaraju
Cover Designer: Mark Rogers

Typeset by TNQ Technologies

This book is dedicated to my parents Sudha Sharma and Pramod Sharma, my husband Basant Mishra, and son Advit Mishra.
– Priyanka Sharma

This book is dedicated to my loving family, sisters, brother, parents Devki Machiwal and Durga Prasad Machiwal, my wife Savita, and daughter Mahi.
– Deepesh Machiwal

Contributors

Kevin O. Achieng, Department of Civil and Architectural Engineering, University of Wyoming, Laramie, WY, United States; Department of Crop & Soil Sciences, University of Georgia, Athens, GA, United States
Jan F. Adamowski, Department of Bioresource Engineering, Faculty of Agricultural and Environmental Science, McGill University, Montreal, QC, Canada
Sheikh Hefzul Bari, Department of Civil Engineering, Leading University, Sylhet, Bangladesh
Lu Chen, School of Civil and Hydraulic Engineering, Huazhong University of Science and Technology, Wuhan, Hubei, China
Nastaran Chitsaz, National Centre for Groundwater Research and Training, College of Science and Engineering, Flinders University, Bedford Park, South Australia, Australia
Ali Danandeh Mehr, Department of Civil Engineering, Antalya Bilim University, Antalya, Turkey
Ravinesh C. Deo, School of Agricultural Computational and Environmental Sciences, International Centre of Applied Climate Sciences (ICACS), University of Southern Queensland, Springfield, QLD, Australia
Farshad Fathian, Department of Water Science and Engineering, Faculty of Agriculture, Vali-e-Asr University of Rafsanjan, Rafsanjan, Kerman Province, Iran
Salim Heddam, Faculty of Science, Agronomy Department, Hydraulics Division, Laboratory of Research in Biodiversity Interaction Ecosystem and Biotechnology, University 20 Août 1955, Skikda, Algeria
Md Manjurul Hussain, Institute of Water and Flood Management, Bangladesh University of Engineering and Technology, Dhaka, Bangladesh
Saeid Janizadeh, Department of Watershed Management, Faculty of Natural Resources and Marine Sciences, Tarbiat Modares University, Noor, Mazandaran Province, Iran
V. Jothiprakash, Department of Civil Engineering, Indian Institute of Technology Bombay, Mumbai, Maharashtra, India
Özgür Kişi, Department of Civil Engineering, School of Technology, Ilia State University, Tbilisi, Georgia
Anil Kumar, Department of Soil and Water Conservation Engineering, College of Technology, G.B. Pant University of Agriculture and Technology, Pantnagar, Uttarakhand, India


Andreas Langousis, Department of Civil Engineering, School of Engineering, University of Patras, University Campus, Rio, Patras, Greece
Deepesh Machiwal, Division of Natural Resources, ICAR-Central Arid Zone Research Institute, Jodhpur, Rajasthan, India
Ishtiak Mahmud, Civil and Environmental Engineering, Shahjalal University of Science and Technology, Sylhet, Bangladesh
Arash Malekian, Department of Reclamation of Arid and Mountainous Regions, Faculty of Natural Resources, University of Tehran, Tehran, Iran
Anurag Malik, Punjab Agricultural University, Regional Research Station, Bathinda, Punjab, India; Department of Soil and Water Conservation Engineering, College of Technology, G.B. Pant University of Agriculture and Technology, Pantnagar, Uttarakhand, India
Georgia Papacharalampous, Department of Water Resources and Environmental Engineering, School of Civil Engineering, National Technical University of Athens, Iroon Polytechniou 5, Zografou, Greece
P.L. Patel, Department of Civil Engineering, Sardar Vallabhbhai National Institute of Technology, Surat, Gujarat, India
Mir Jafar Sadegh Safari, Department of Civil Engineering, Yaşar University, Izmir, Turkey
Priyank J. Sharma, Department of Civil Engineering, Sardar Vallabhbhai National Institute of Technology, Surat, Gujarat, India
Priyanka Sharma, Groundwater Hydrology Division, National Institute of Hydrology, Roorkee, Uttarakhand, India
Mohammad Istiyak Hossain Siddiquee, Data and Knowledge Engineering, Otto-von-Guericke University of Magdeburg, Magdeburg, Saxony-Anhalt, Germany
Vijay P. Singh, Department of Biological & Agricultural Engineering and Zachry Department of Civil & Environmental Engineering, Texas A&M University, College Station, TX, United States
Doudja Souag-Gamane, Leghyd Laboratory, University of Sciences and Technology Houari Boumediene, Bab Ezzouar, Algiers, Algeria
Yazid Tikhamarine, Southern Public Works Laboratory (LTPS), Tamanrasset Antenna, Tamanrasset, Algeria; Department of Science and Technology, University of Tamanrasset, Sersouf, Tamanrasset, Algeria
Mukesh K. Tiwari, Department of Irrigation and Drainage Engineering, College of Agricultural Engineering and Technology, Anand Agricultural University, Godhra, Gujarat, India
Hristos Tyralis, Air Force Support Command, Hellenic Air Force, Elefsina Air Base, Elefsina, Greece
Mehdi Vafakhah, Department of Watershed Management, Faculty of Natural Resources and Marine Sciences, Tarbiat Modares University, Noor, Mazandaran Province, Iran

About the editors

Dr. Priyanka Sharma is currently working as a Research Associate under the National Hydrology Project in the Groundwater Hydrology Division at the National Institute of Hydrology (NIH), Roorkee, India. She completed her B.Tech (Agricultural Engineering) at Chandra Shekhar Azad University of Agriculture and Technology, Kanpur, India, in 2012. She obtained her M.Tech in 2014 and PhD in 2018 from Maharana Pratap University of Agriculture and Technology (MPUAT), Udaipur. Between March and June 2016, she worked as a Senior Research Fellow in the Department of Soil and Water Engineering, College of Technology and Engineering (CTAE), MPUAT, Udaipur, India. From January to June 2018, Priyanka worked as an Assistant Professor at the School of Agriculture, Lovely Professional University, Punjab, India. She also worked as an Assistant Professor in the Faculty of Agriculture Science, Maharishi Arvind University, Jaipur, Rajasthan, India. Her research interests include the application of statistical and stochastic time series modeling techniques and of modern data-driven techniques such as artificial intelligence in solving problems related to hydrology and water resources. She has published seven research papers in reputed peer-reviewed journals and conferences and has contributed three book chapters. She has been conferred the JAE Best Paper Award and the Distinguished Scientist Associate Award for her outstanding research work in the field of hydrology. She is a life member of two national professional societies.

Dr. Deepesh Machiwal is a Principal Scientist (Soil and Water Conservation Engineering) in the Division of Natural Resources at ICAR-Central Arid Zone Research Institute (CAZRI), Jodhpur, India. He obtained his PhD from the Indian Institute of Technology, Kharagpur, in 2009. He has more than 20 years of experience in soil and water conservation engineering and groundwater hydrology. His current research area is modeling groundwater levels in the Indian arid region under changing climate and groundwater demands. Deepesh served from 2005 to 2011 as an Assistant Professor in the All India Coordinated Research Project on groundwater utilization at the College of Technology and Engineering, Udaipur, India. He has worked as co-principal investigator in three externally funded research projects funded by ICARDA, ICAR, and the Government of Rajasthan, India. He has authored 1 book, edited 2 books, and contributed 19 book chapters. Deepesh has to his credit 39 papers in international and 19 papers in national journals, 2 technical reports, 4 extension bulletins, 16 popular articles, and 33 papers in conference proceedings. His authored book, Hydrologic Time Series Analysis: Theory and Practice, received the Outstanding Book Award for 2012-13 from ISAE, New Delhi, India. He has also been awarded the Commendation Medal Award in 2019 by ISAE, the Best Paper Award 2018 by CAZRI, Jodhpur, the Achiever Award 2015 by SADHNA, Himachal Pradesh, the Distinguished Service Certificate Award for 2012-2013 by ISAE, and the IEI Young Engineer Award in 2012 by the Institution of Engineers (India), West Bengal. He is a recipient of the Foundation Day Award of CAZRI for 2012, 2013, and 2014 and of an Appreciation Certificate from IEI, Udaipur, in 2012. Earlier, he was awarded a Junior Research Professional Fellowship by IWMI, Sri Lanka, to participate in the International Training and Research Program on Groundwater Governance in Asia: Theory and Practice, and he was conferred the Second Best Comprehensive Group Paper Award by IWMI, Sri Lanka, in 2007. He was also sponsored by FAO, Rome, and UN-Water to participate in two international workshops in China and Indonesia. He is a life member of eight professional societies and associations. Currently, Deepesh is serving as an Advisory Board Member of Ecological Indicators (Elsevier) and served as an Associate Editor for the Journal of Agricultural Engineering (ISAE) during 2018-20. He is a reviewer of several national and international journals related to soil and water engineering and hydrology.

Foreword

Water-related disasters, also called hydro-hazards, are among the most frequently occurring natural hazards that threaten people as well as socioeconomic development. The Emergency Events Database of 2019 shows that, globally, water-related disasters accounted for about 74% of all natural disasters between 2001 and 2018, and floods and droughts alone caused more than 166,000 deaths during the past 20 years, affected over 3 billion people, and caused economic damage of almost US$700 billion worldwide. Unfortunately, the number of people impacted and the economic losses caused are projected to rise in the future due to growing population in flood-prone areas, climate change, global warming, deforestation, loss of wetlands, increasing hurricanes, unplanned development, and rising sea level. Flood hazards may be mitigated through flood preparedness involving the development of flood risk management systems, which depend on streamflow modeling and forecasting. Reliable and accurate streamflow forecasting is essential for the optimal planning and management of water resources systems. A variety of models, ranging from knowledge-based physical models to data-driven empirical models, have been proposed over the years to derive streamflow forecasts. The past few decades have witnessed a proliferation of data-driven models for streamflow forecasting. The past decade, in particular, has seen a precipitous shift from traditional models employing stand-alone data-driven techniques to advanced hybrid models integrating more than one data-driven technique with some sort of data preprocessing technique to improve forecast accuracy. Even the classical autoregressive moving average (ARMA) models have been advanced into nonlinear autoregressive with exogenous input (NARX) models, self-exciting threshold autoregressive (SETAR) models, generalized autoregressive models with conditional heteroscedasticity (GARCH), and hybridized SETAR-GARCH models to improve the accuracy of streamflow forecasting. Details about such advances are normally not found in a single source. Furthermore, several advances have been made in artificial intelligence techniques, such as artificial neural networks, support vector machines, support vector regression, and genetic programming, with both stand-alone and hybrid approaches that have improved model performance and provided better streamflow forecasts. Therefore, this book is a valuable contribution to the field of hydrology, where traditional as well as modern models of streamflow forecasting involving statistical, stochastic, and artificial intelligence techniques are described along with case studies selected from different parts of the world that illustrate their applications to real-world data. The book also provides an overview of major approaches employed in advancing streamflow forecasting over half a century. Each book chapter maintains a proper balance of theory and practical demonstration of the underlying data-driven technique employed for streamflow forecasting. The book provides information on data-driven tools for streamflow forecasting, which will guide the reader in selecting a suitable technique to forecast streamflow under a given set of conditions. Chapter contributors are active researchers who have been involved in advancing streamflow forecasting. The book editors deserve applause for bringing out this valuable book.

Vijay P. Singh, Ph.D., D.Sc., P.E., P.H., Hon. D. WRE
University Distinguished Professor, Regents Professor
Caroline and William N. Lehrer Distinguished Chair in Water Engineering
Department of Biological and Agricultural Engineering and Zachry Department of Civil and Environmental Engineering, Texas A&M University, College Station, TX, United States

Preface

Streamflow forecasting plays a key role in water resources planning and management, including irrigated agriculture, hydropower production, and mitigation of destructive natural disasters. Accurate forecasting of streamflow is a challenging task and, at the same time, a very complex process. Therefore, researchers from all over the world have proposed many models to forecast streamflow with improved performance. Until now, a great deal of research has been conducted to address this complex process by using many data-driven models. Accurate streamflow forecasting has received much attention over the last two decades. Since 2010, the efforts of researchers to increase the forecasting accuracy of streamflow have gained considerable momentum for long-term water resources planning and management. Before the 2000s, academicians and researchers mainly focused on streamflow forecasting using traditional methods such as statistical and stochastic time series modeling methods. In the 2010s, however, the focus of researchers shifted toward advances in streamflow forecasting methods, adopting new stand-alone data-driven approaches such as artificial intelligence techniques and hybrid data-driven approaches in which more than one technique, preferably an artificial intelligence technique, is amalgamated with other methods. Furthermore, the recent advances made in the subject of streamflow forecasting are mostly confined to research articles, where detailed procedures for applying the modern methods for streamflow forecasting are not adequately dealt with. It is revealed from the literature that a book describing the step-by-step procedures of the advanced data-driven models along with their case studies in streamflow forecasting is not available. Therefore, this book is an attempt to bridge this gap by providing theoretical descriptions, systematic methodologies, and practical demonstration of the traditional and modern tools and techniques adopted for streamflow forecasting over the years. Thus, the book is very useful for readers to gain an insight into the developments made in the field of streamflow forecasting. Furthermore, this book follows a "theory-to-practice" approach, where procedures for applying the different methods are explained in concrete and ordered steps that may be easily followed by readers who wish to apply any of the methods in their own studies. It may be difficult for readers to find all such details about the advanced methods in a single source in the literature. Moreover, the methods detailed in this book may be useful for readers making forecasts of a variable in any subject, including hydrology and water resources engineering.



This book comprises a total of 13 chapters that cover the most promising hydrological data-driven prediction approaches, which have been used to develop accurate yet efficient forecasting models and to forecast streamflows during earlier and recent times. Chapter 1 provides a detailed overview of streamflow forecasting models and the advances made in traditional as well as modern data-driven techniques. A comprehensive review of the literature is provided based on studies related to streamflow forecasting at different timescales using data-driven techniques. Chapters 2 and 3 describe traditional methods and their recent advances, including statistical linear and nonlinear time series models such as exponential smoothing and autoregressive fractionally integrated moving average (ARFIMA) models, vector autoregressive models without/with exogenous variables (VAR/VARX), and multiple/multivariate generalized autoregressive conditional heteroscedasticity (MGARCH). Chapters 4-10 cover the advanced stage of development and verification of highly complex or nonlinear streamflow forecasting models involving artificial intelligence approaches. These chapters explain the concept, procedure, and application of stand-alone artificial intelligence models such as the artificial neural network, adaptive neuro-fuzzy inference system, genetic programming, gene expression programming, model tree technique, support vector regression, and extreme learning machines to forecast streamflows. These chapters also include comparisons among the salient artificial intelligence methods. When the streamflow data are highly nonstationary, artificial intelligence methods may not be able to simulate and forecast the streamflow without pre/postprocessing of the input/output data. Recently, in such situations, hybrid approaches that combine data preprocessing and artificial intelligence techniques have been increasingly adopted as an important tool to improve the forecast accuracy of streamflow. Hence, Chapters 12 and 13 of the book include the recent hybrid approaches that are progressively being used to improve the forecast accuracy and to reduce the uncertainties in streamflow forecasting. The editors believe that the book may be very useful to students, researchers, and academicians as well as planners, managers, and policy-makers involved in sustainable development and management of water resources.

Priyanka Sharma
Deepesh Machiwal

Acknowledgment

This book would not have been possible without the support of many people. The first editor (Priyanka Sharma) would like to express her sincere gratitude and due respect to Dr. Surjeet Singh, Scientist F, and Dr. J.V. Tyagi, Director, National Institute of Hydrology (NIH), Roorkee, India. She is immensely grateful for their valuable support and continuous encouragement. The second editor (Deepesh Machiwal) gratefully acknowledges the support and motivation provided by Dr. O.P. Yadav, Director, ICAR-Central Arid Zone Research Institute (CAZRI), Jodhpur, India. Deepesh would like to express his gratitude to his mentor, Dr. Madan Kumar Jha, Professor, Indian Institute of Technology (IIT) Kharagpur, India, who has been a constant source of inspiration to him. He further feels a sense of indebtedness to Dr. Adlul Islam, Principal Scientist, NRM Division, Indian Council of Agricultural Research (ICAR), New Delhi, India, for his stimulation to do something creative. The editors acknowledge the generous support and inspiration of their friends and colleagues received during the entire course of this book project. The editors thank all the chapter contributors for their selfless and quality contributions to this book. They sincerely appreciate the many valuable insights, arising from the experience and dedication of the contributors, that enriched the material presented in the book. They would also like to thank all the reviewers for their insightful suggestions and comments, which have led to numerous improvements in the content of the chapters. Each chapter of the book has been revised at least twice. The editors are glad to have a foreword from Professor Vijay P. Singh, Department of Biological & Agricultural Engineering and Zachry Department of Civil & Environmental Engineering, Texas A&M University, College Station, TX, United States. The editors are thankful to the Elsevier book project team that assisted them from time to time at various steps of the publishing process since the beginning of this book project. The editors are delighted to specially acknowledge Louisa Munro (Acquisitions Editor: Aquatic Sciences), Fisher Michelle (Acquisitions Editor for Molecular Biology), Hannah Makonnen (Editorial Project Manager), Aleksandra Packowska (Editorial Project Manager), and Kiruthika Govindaraju (Senior Project Manager) for their professional assistance throughout the publishing process of the book. A special word of appreciation is also extended to the designer team of the publisher for their prompt actions in revising and improving the cover page of the book.

Priyanka Sharma
Deepesh Machiwal

Chapter 1

Streamflow forecasting: overview of advances in data-driven techniques

Priyanka Sharma¹, Deepesh Machiwal²

¹Groundwater Hydrology Division, National Institute of Hydrology, Roorkee, Uttarakhand, India; ²Division of Natural Resources, ICAR-Central Arid Zone Research Institute, Jodhpur, Rajasthan, India

1.1 Introduction

Runoff water generated from precipitation may reach a stream by overland flow, subsurface flow, or both, and moves toward the oceans in a channelized form; this flow is called streamflow or river flow. Streamflow is generated by a combination of baseflow (return flow from groundwater), interflow (rapid subsurface flow through macropores and seepage zones), and saturated overland flow (Mosley and McKerchar, 1993). A schematic diagram of the three components of streamflow, i.e., interflow, saturated overland flow, and baseflow, is shown in Fig. 1.1 for conditions with and without a rainfall event. Streamflow, expressed as discharge in units of cubic feet per second (ft³/s) or cubic meters per second (m³/s), is the only phase of the hydrological cycle in which water is confined in well-defined channels, which permits accurate measurements to be made of the quantities involved. At the same time, streamflow is one of the most complex quantitative parameters; it takes place in a stream or channel and varies in time and space (Wiche and Holmes, 2016). Analysis of streamflow data provides a description of the river flow regime, enables comparison between rivers, and helps in prediction of future river flows (Davie, 2008). Streamflow is experiencing long-term changes as freshwater demands increase worldwide owing to population growth and the changing climate (Oki and Kanae, 2006). However, it is difficult to predict changes in future streamflows because the underlying physical process depends upon more than one random variable, such as precipitation, evapotranspiration, topography, and human activities. Hence, it is a very complex and nonlinear process of the hydrologic cycle that is not well understood. This necessitates prediction or forecasting of future streamflows for efficient water resources planning and management.



FIGURE 1.1 Schematic diagram showing interflow, saturated overland flow, and baseflow components of streamflow during (A) dry period and (B) rainy period. Modified from Mosley, M.P. and McKerchar, A.I. 1993. Streamflow, chapter 8. In: D.R. Maidment (Editor in Chief), Handbook of Hydrology, McGraw-Hill, Inc., New York, pp. 8.1-8.35.

Streamflow forecasting may be performed over the short term for flood defense, the medium term for reservoir operation, and the long term for water resources planning and management. Streamflows have been forecasted using both physically based and data-driven models. In the literature, a wide range of both types of models has been proposed and applied in streamflow forecasting. The subject of streamflow forecasting has flourished, with a large number of studies being continuously reported over the past five decades in which numerous forecasting approaches are used. Hydrologists are active in proposing and adopting new tools and techniques and/or refining existing methods in streamflow forecasting to overcome the drawbacks of older approaches. It has been revealed that streamflow forecasts are mainly affected by great uncertainty, especially in ungauged or poorly gauged basins where good-quality streamflow data of adequate length are not available. In addition, the chaotic behavior of the streamflow process, with nonlinear and nonstationary time series, is another major reason for unsatisfactory forecasts. Therefore, researchers have made advances in understanding the physical process as well as in analyzing the process empirically using modern computing techniques over the years. This chapter aims at presenting the advances made in streamflow forecasting using data-driven techniques/models. First, data-driven models ranging from traditional to modern techniques are classified into suitable groups depending on the nature of the models. Then, the historical development and application of different data-driven models in streamflow forecasting are detailed. Furthermore, a comparative evaluation of data-driven models is presented and current trends in recent studies are highlighted. Moreover, key challenges experienced in making accurate streamflow forecasts are discussed and concluding remarks are provided.

1.2 Measurement of streamflow and its forecasting

Measuring, estimating, and/or predicting streamflow is an important task in surface water hydrology. There exist a variety of methods for monitoring streamflow, and each method remains specific to a particular type of stream. The methods to quantify and monitor streamflow are grouped into four categories (Dobriyal et al., 2017): (i) direct measurement methods, (ii) velocity-area methods, (iii) constricted flow methods, and (iv) noncontact measurement methods (Fig. 1.2). An overview of the methods used for streamflow monitoring is provided by Mosley and McKerchar (1993), and their advantages and disadvantages are presented by Dobriyal et al. (2017). Suitability of a method for streamflow monitoring depends on factors such as the water quantities to be measured, the degree of accuracy, permanent or temporary installation, and the cost incurred (Dobriyal et al., 2017). Many activities associated with the planning, operation, management, and control of a water resource system require forecasts of future streamflow, which is a challenging task for water resources engineers, planners, and managers.

FIGURE 1.2 Classification of methods used for streamflow monitoring. Modified from Dobriyal, P., Badola, R., Tuboi, C. and Hussain, S.A. 2017. A review of methods for monitoring streamflow for sustainable water resource management. Applied Water Science, 7: 2617-2628.
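A widely used member of the velocity-area category divides the cross section into verticals and sums the product of mean velocity, depth, and assigned width of each subsection. The short Python sketch below illustrates the midsection form of this computation; the station spacings, depths, and velocities are hypothetical values chosen only for illustration, not measurements reported in this chapter.

# Minimal midsection velocity-area computation (hypothetical measurements).
# Each vertical i has a distance from the bank (m), a depth (m), and a mean
# column velocity (m/s); discharge Q = sum(v_i * d_i * w_i), where w_i is
# half the distance between the neighbouring verticals.

stations   = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]     # distance from left bank (m)
depths     = [0.0, 0.45, 0.80, 0.95, 0.60, 0.0] # water depth at each vertical (m)
velocities = [0.0, 0.30, 0.55, 0.60, 0.35, 0.0] # mean velocity at each vertical (m/s)

def midsection_discharge(x, d, v):
    """Return total discharge (m^3/s) by the midsection method."""
    q = 0.0
    for i in range(len(x)):
        left = x[i] - x[i - 1] if i > 0 else 0.0
        right = x[i + 1] - x[i] if i < len(x) - 1 else 0.0
        width = 0.5 * (left + right)   # width assigned to vertical i
        q += v[i] * d[i] * width       # partial discharge of subsection i
    return q

print(f"Estimated discharge: {midsection_discharge(stations, depths, velocities):.3f} m^3/s")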


Accurate streamflow forecasts are needed for the efficient operation of water resources systems within technical, economical, legal, and political priorities (Salas et al., 2000). Streamflow forecasts should take into account the spatial and temporal variability of the entire streamflow field for sound control and management of the water resources system. It is worth mentioning that streamflow forecasting can be of two types, depending upon the temporal scale at which forecasts are made (Yaseen et al., 2015): (i) short-term or real-time forecasting with lead times of hours and days, which is crucial for real-time reservoir operation and reliable operation of flood warning and mitigation systems, and (ii) long-term forecasting with lead times varying from weeks to months, which is important for the planning and operation of reservoirs, hydropower generation, sediment transport, drought analysis and mitigation, irrigation management decisions, scheduling of releases, and many other applications.

1.3 Classification of techniques/models used for streamflow forecasting

In the literature, the main objective of employing streamflow forecasting models is to study the operation of a hydrologic system in order to predict its behavior. Thus, streamflow prediction and forecasting has become one of the most important issues in hydrology over the past few decades and is currently attracting hydrologists to advance research on accurate hydrologic predictions. In this section, a brief overview of the state-of-the-art scientific approaches applied in hydrology, especially for streamflow modeling and forecasting, is provided. Presently, there exist a variety of techniques used for modeling a hydrologic process, and it is imperative to first distinguish among the several types of techniques/models used, to enhance the reliability and accuracy of forecasting hydrological variables, and to account for the scale and time dependency of their errors in evaluation. Over the last three to four decades, a large number of models have been proposed for hydrologic time series prediction and forecasting and for improving hydrologic forecasting accuracy. Broadly, these models are classified into three categories (Bourdin et al., 2012; Devia et al., 2015; Liu et al., 2018): (i) physically based models, (ii) conceptual models, and (iii) black-box models (Fig. 1.3). The physically based models, also called white-box or process-based models, describe hydrological characteristics in detail by solving differential equations describing the physical laws of mass, energy, and momentum conservation. The conceptual models, also known as gray-box models, are a descriptive representation of the hydrologic system that incorporates the modeler's understanding of the relevant physical, chemical, and hydrologic conditions. On the other hand, black-box models, sometimes called empirical or statistical models, are based on input-output relationships from a statistical point of view rather than on physical principles. The latter two categories of models do not describe the underlying hydrologic processes.



FIGURE 1.3 Schematic of black-, gray-, and white-box models of streamflow forecasting.

Nowadays, hydrologic models are classified into only two categories based on the availability of knowledge and data about the system being modeled (Solomatine and Ostfeld, 2008): (i) conceptual or physically based (process-based or knowledge-driven) models that describe the physical phenomenon or system, and (ii) data-driven (empirical or statistical) models involving mathematical equations assessed from analysis of concurrent input and output time series. Process-based or physical models provide accurate estimates of the hydrologic variable, although they require a large amount of physical data and a complex mathematical representation of the hydrologic system. In the context of a large-scale hydrological system representing several complex hydrological processes, obtaining accurate and precise site-specific predictions is a challenging task in many situations using physically based hydrological models as well as linear regression and time series models. The physical models usually do not attempt to take into account the stochastic nature of the underlying hydrologic system, and the linear regression models do not consider the nonlinear dynamics of hydrologic processes. Hence, nonlinear statistical models have been suggested and applied in hydrological forecasting (Jacoby, 1966; Amorocho and Brandstetter, 1971; Tong, 1990). However, it is difficult to formulate nonlinear models with reasonable adequacy, and hence, in later studies, artificial neural network (ANN) models were adopted for complex hydrologic modeling (Saad et al., 1996; Clair and Ehrman, 1998; Jain et al., 1999; Coulibaly et al., 2000). In streamflow forecasting applications, data-driven models are becoming increasingly popular because of their rapid development times, minimum data requirements, and ease of real-time implementation. In data-driven streamflow forecasting, initially conventional or statistical and time series analysis models were used, which involve multiple linear regression, autoregressive moving average (ARMA), and other modified versions of ARMA models.
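To make the input-output view of data-driven models concrete, the Python sketch below builds lagged-flow predictors from a streamflow record and fits a multiple linear regression by ordinary least squares. The synthetic monthly series and the choice of three lags are assumptions for illustration only; they are not data or settings taken from this chapter.

import numpy as np

# Hypothetical monthly streamflow record (m^3/s); in practice this would be
# an observed gauge series.
rng = np.random.default_rng(42)
t = np.arange(240)
flow = 50 + 20 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 5, t.size)

def lag_matrix(series, n_lags):
    """Return predictors [Q(t-1), ..., Q(t-n_lags)] and target Q(t)."""
    X = np.column_stack([series[n_lags - k: len(series) - k] for k in range(1, n_lags + 1)])
    y = series[n_lags:]
    return X, y

n_lags = 3
X, y = lag_matrix(flow, n_lags)
X = np.column_stack([np.ones(len(y)), X])        # add an intercept column
coef, *_ = np.linalg.lstsq(X, y, rcond=None)     # least-squares MLR fit

# One-step-ahead forecast from the three most recent flows (lag 1 first).
x_next = np.concatenate(([1.0], flow[::-1][:n_lags]))
print("Regression coefficients:", np.round(coef, 3))
print("One-step-ahead forecast:", round(float(x_next @ coef), 2), "m^3/s")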


In time series modeling applications, two basic assumptions are made about the streamflow time series. According to the first assumption, a streamflow time series is considered to originate from a random probability distribution or pattern with an infinite number of degrees of freedom. Under this assumption, conventional or traditional linear time series models, such as Thomas-Fiering, ARMA, autoregressive integrated moving average (ARIMA), seasonal autoregressive integrated moving average (SARIMA), fractionally differenced autoregressive integrated moving average (FARIMA), autoregressive fractionally integrated moving average (ARFIMA), etc., have been successfully employed for forecasting streamflows (e.g., Carlson et al., 1970; Hipel et al., 1977; Salas et al., 1980; Salas et al., 1985; Haltiner and Salas, 1988; Yu and Tseng, 1996; Montanari et al., 1997; Kothyari and Singh, 1999; Abrahart and See, 2000; Huang et al., 2004; Maria et al., 2004; Amiri, 2015; Valipour, 2015; Sharma et al., 2018; Papacharalampous and Tyralis, 2020). The second assumption is that a streamflow time series is derived from a deterministic dynamic system such as chaos. Time series models are capable of capturing a linear statistical dependence among several successive time lags (Stojkovic et al., 2017). Over the past two to three decades, nonlinear or complex streamflow forecasting models have been increasingly applied for making streamflow forecasts (e.g., Jayawardena and Lai, 1994; Jayawardena and Gurung, 2000; Elshorbagy et al., 2002; Wang et al., 2006a; Wu et al., 2009; Modarres and Ouarda, 2013; Amiri, 2015). Prediction of streamflow by time series analysis models can be further improved by coupling with empirical mode decomposition (black-box models), an approach that is useful for analyzing nonstationary hydrologic time series (Wang et al., 2015). Thereafter, ANN models were extensively applied in forecasting streamflows. Basically, ANN applications offer a useful tool in solving problems related to pattern recognition, nonlinear control, and time series prediction. The ANN models are very effective in dealing with a large amount of dynamic, nonlinear, and noisy time series data, especially when the physical processes of the underlying system and their interrelationships are not completely known (See and Openshaw, 2000; Kasabov, 1996). However, all linear and nonlinear conventional models have limitations when nonstationarity is present in the streamflow time series (Adamowski, 2008). In order to overcome the problems with nonstationary time series data, wavelet transformation of the original streamflow data was investigated in a number of studies and was found very effective in improving the accuracy of forecasted streamflows. Fuzzy logic, or the fuzzy inference system (FIS), is another promising tool that has been adopted in many studies for generating real as well as synthetic time series of streamflow forecasts. Jacquin and Shamseldin (2009) provide a review of the applications of FIS in streamflow forecasting. They reported that the FIS can be used as effectively as the ANN for forecasting a hydrologic time series, though the former could not gain as much popularity as the latter among researchers. However, the FIS technique was subsequently integrated with the ANN to enhance the latter's capability in yielding more reliable forecasts of streamflows.
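As a minimal illustration of the ANN approach just described, the sketch below trains a small multilayer perceptron on lagged flows to produce one-step-ahead forecasts. The scikit-learn MLPRegressor is an assumed, generic tool choice rather than software endorsed by this chapter, and the synthetic series, lag depth, network size, and chronological train-test split are illustrative assumptions.

import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import MinMaxScaler

# Hypothetical monthly streamflow series (m^3/s).
rng = np.random.default_rng(0)
t = np.arange(360)
flow = 60 + 25 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 6, t.size)

def make_lagged(series, n_lags):
    """Lagged flows Q(t-1)..Q(t-n_lags) as inputs, Q(t) as output."""
    X = np.column_stack([series[n_lags - k: len(series) - k] for k in range(1, n_lags + 1)])
    return X, series[n_lags:]

X, y = make_lagged(flow, n_lags=4)
split = int(0.8 * len(y))                       # keep the test period at the end
scaler = MinMaxScaler().fit(X[:split])          # scale inputs to [0, 1] before training
X_train, X_test = scaler.transform(X[:split]), scaler.transform(X[split:])
y_train, y_test = y[:split], y[split:]

# One hidden layer of 8 sigmoid neurons; weights are learned from the data.
ann = MLPRegressor(hidden_layer_sizes=(8,), activation="logistic",
                   max_iter=5000, random_state=0)
ann.fit(X_train, y_train)

forecast = ann.predict(X_test)                  # one-step-ahead forecasts
rmse = float(np.sqrt(np.mean((forecast - y_test) ** 2)))
print(f"Test RMSE: {rmse:.2f} m^3/s")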



There are several studies in the literature where the adaptive neuro-fuzzy inference system (ANFIS) was employed for streamflow forecasting, and they indicated that the performance of the ANFIS model was better than that of standalone ANN models (e.g., Chau et al., 2005; Nayak et al., 2005a; Aqil et al., 2007; Mukerji et al., 2009; Pramanik and Panda, 2009). In addition, other artificial intelligence (AI)-based techniques/models capable of analyzing and forecasting large-scale and nonlinear streamflow datasets have been adopted for the generation of reliable streamflow forecasts since the 1990s. The family of AI-based techniques, which works on an understanding of the brain and nervous systems, has become increasingly popular among researchers and academicians working in the field of hydrology and water resources. Despite the fact that AI-based models are suitable for simulating and forecasting streamflow time series, there are some problems with their application under the condition of highly nonlinear or chaos-based streamflow data. In the last 5 years, nonlinear AI models have exhibited outstanding results, with close agreement between the forecasted and observed streamflow data (e.g., Huang et al., 2014; Chen et al., 2015; Meshgi et al., 2015; Zhang et al., 2015; Deo and Sahin, 2016; Kasiviswanathan et al., 2016; Yaseen et al., 2016a; Matos et al., 2018; Mehdizadeh and Sales, 2018; Guo et al., 2020; Niu et al., 2021). In recent years, with phenomenal advances in AI-based research, there is a trend of hybridizing two or more data-driven models to obtain more accurate streamflow forecasts. Many data-driven techniques, such as wavelet transformation, singular spectrum analysis (SSA), particle swarm optimization (PSO), etc., are used as data preprocessing tools and are combined with other existing AI-based models. The hybrid approach, utilizing the integration of two or more data assimilation and modeling techniques, increases the precision and accuracy of forecasted streamflows (Fahimi et al., 2017). Moreover, hybrid AI-based models have been found to have a greater capability to describe observed hydrological data statistically and to explore unseen information hidden in observed data records (Nourani et al., 2014). Therefore, the use of hybrid models is emphasized in streamflow forecasting studies to exploit the capability of the individual AI-based or data-driven models in dealing with inaccuracy in the forecasted streamflows. The data-driven models may be classified into many overlapping categories such as AI, computational intelligence, soft computing, machine learning, data mining, and intelligent data analysis, among others (Solomatine et al., 2009). Zhang et al. (2018a,b) classified data-driven forecasting models into three categories: (i) evolutionary algorithms (genetic algorithm [GA], PSO, genetic programming [GP]), (ii) fuzzy logic algorithms (fuzzy logic systems and neuro-fuzzy), and (iii) classification methods and artificial network techniques (self-organization mapping, SSA, ANN, and support vector machine [SVM]).


Another classification scheme, proposed by Yaseen et al. (2015), subcategorized the data-driven models into four groups: (i) classifier and machine learning approaches (ANN, decision trees, SVM, etc.), (ii) fuzzy sets, (iii) evolutionary computation (GP, gene expression programming [GEP], GA, PSO, ant colony optimization [ACO], etc.), and (iv) wavelet conjunction models (discrete and continuous wavelet transformations). A classification chart of data-driven models, ranging from traditional to modern techniques used in streamflow forecasting, is illustrated in Fig. 1.4.
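For the wavelet conjunction models in group (iv), the flow series is first decomposed with a discrete wavelet transform and each resulting subseries is forecast separately before the component forecasts are recombined. The sketch below shows only the decomposition step, using the PyWavelets package as an assumed tool; the Daubechies-4 wavelet, three decomposition levels, and the synthetic series are illustrative choices, not recommendations from this chapter.

import numpy as np
import pywt

# Hypothetical daily streamflow series with a seasonal cycle plus noise.
rng = np.random.default_rng(1)
t = np.arange(730)
flow = 80 + 30 * np.sin(2 * np.pi * t / 365) + rng.normal(0, 10, t.size)

# Multilevel discrete wavelet decomposition (Daubechies-4, 3 levels).
coeffs = pywt.wavedec(flow, "db4", level=3)     # [cA3, cD3, cD2, cD1]

# Reconstruct one subseries per coefficient set; in a hybrid model each
# subseries would be forecast by its own data-driven model and the
# component forecasts summed to give the final streamflow forecast.
subseries = []
for i in range(len(coeffs)):
    keep = [c if j == i else np.zeros_like(c) for j, c in enumerate(coeffs)]
    subseries.append(pywt.waverec(keep, "db4")[: len(flow)])

print("Number of subseries:", len(subseries))
print("Max reconstruction error:", float(np.max(np.abs(sum(subseries) - flow))))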

1.4 Growth of data-driven methods and their applications in streamflow forecasting

1.4.1 Time series modeling

Time series modeling, including the exponential smoothing technique, originated in the 1950s and 1960s with the pioneering works of Brown (1959, 1963), Holt (1957, reprinted 2004), and Winters (1960). It is the first data-driven technique that was widely applied for forecasting purposes in business and industry, well before its applications appeared in hydrology. Later on, stochastic modeling of time series became a popular tool in hydrological forecasting. The notion of stochasticity was first introduced by Yule (1927), who suggested that every time series can be considered the realization of a stochastic process. However, the use of stochastic models for modeling and forecasting of streamflow datasets did not begin until about 1960 (Chiu, 1972). Significant progress in time series modeling was made by Box and Jenkins (1970), who developed a standard and systematic methodology for the application of time series modeling and forecasting through four stages: (i) identification of a suitable model, (ii) parameter estimation, (iii) model validation, and (iv) forecasting. On the basis of this simple idea of time series modeling, many time series approaches such as Thomas-Fiering, ARMA, autoregressive integrated moving average (ARIMA), autoregressive integrated moving-average with exogenous input (ARIMAX), linear regression (LR), and multiple linear regression (MLR) have been developed and have been applied for forecasting of streamflows since the 1970s (Harms and Campbell, 1967; Carlson et al., 1970; Bonné, 1971; Ledolter, 1978; Sen, 1978; Alley, 1985; Awwad et al., 1994; Montanari et al., 1997). In 1985, structural time series modeling was used for streamflow forecasting, where the Kalman filtering technique, originally developed by Kalman (1960), was applied to forecast river flows of the Sturgeon River in northern Ontario, Canada (Burn and McBean, 1985). Krstanovic and Singh (1991a) first developed a univariate streamflow forecasting model by employing maximum entropy spectral analysis (MESA). Then, they used the developed models for long-term streamflow forecasting for the River Orinoco and River Caroni in South America, the River Krishna and River Godavari in India, and Spring Creek in Louisiana, USA, and compared
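The four Box-Jenkins stages can be compressed into a few lines of code. The sketch below uses the statsmodels package as an assumed tool (it is not prescribed by this chapter); the synthetic monthly series and the seasonal ARIMA order are placeholders that would, in practice, come out of the identification stage via ACF/PACF inspection or information criteria.

import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# Hypothetical monthly streamflow series (m^3/s).
rng = np.random.default_rng(7)
t = np.arange(300)
flow = 45 + 15 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 4, t.size)

# Stage (i) identification: ACF/PACF plots or information criteria would
# normally guide the order; a seasonal ARIMA(1,0,1)(1,0,1)[12] is assumed here.
order, seasonal_order = (1, 0, 1), (1, 0, 1, 12)

# Stage (ii) parameter estimation by maximum likelihood.
model = ARIMA(flow, order=order, seasonal_order=seasonal_order).fit()

# Stage (iii) validation: residuals should behave like white noise.
resid = model.resid
lag1_corr = float(np.corrcoef(resid[1:], resid[:-1])[0, 1])
print("Residual lag-1 autocorrelation:", round(lag1_corr, 3))

# Stage (iv) forecasting: 12-month-ahead streamflow forecast.
print(model.forecast(steps=12).round(1))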

FIGURE 1.4 Classification chart of data-driven models ranging from traditional to modern techniques used in streamflow forecasting. (The chart groups statistical models, comprising regression-based models, Markov processes, stochastic time series, and bootstrap resampling, and artificial intelligence-based models, comprising neural networks, fuzzy logic, support vector machines/regression, genetic programming, time-frequency decomposition, model trees, and extreme learning machines, with representative techniques and references under each branch.)

Awwad et al. (1994) used an ARMAX model to forecast streamflow in the Han River basin of Korea. Ooms and Franses (2001) investigated long-memory time series models, namely periodic autoregressive fractional integrated moving average (PARFIMA) and seasonal periodic autoregressive fractional integrated moving average (SPARFIMA), for forecasting monthly streamflows of the Fraser River (Canada). Amendola et al. (2006) examined the performance of the self-exciting threshold autoregressive moving average (SETARMA) model for forecasting daily streamflows of the Jökulsá River in North-West Iceland, Britannica and the Wisconsin River near Wisconsin Dells, US. Shao et al. (2009) proposed functional coefficient time series models with a periodic component for short-term streamflow forecasting, utilizing 10 years (1979–89) of daily rainfall data for the Barron River and Margaret River (Australia). Amiri (2015) analyzed the capability of five classes of nonlinear time series models, namely threshold autoregressive, smooth transition autoregressive, exponential autoregressive, bilinear, and Markov switching autoregressive models, to capture the dynamics in a 12-year (2000–11) daily flow time series of the Colorado River. Table 1.1 presents other successful applications of stochastic time series modeling in streamflow forecasting.

1.4.2 Artificial neural network

For almost two decades (1970s–1990s), the application of time series modeling in streamflow forecasting proliferated with advances made in linear and nonlinear time series modeling (Tong, 1990). Time series models are simple to use in streamflow forecasting; however, they are poor at handling the nonlinearity existing in streamflow time series (Zhang et al., 2018a,b). Hence, AI-based data-driven models that address the issues of nonlinear streamflow time series began to be applied to streamflow forecasting. The ANN is one of the most widely used AI-based techniques in hydrological forecasting. McCulloch and Pitts (1943) first proposed the idea of the ANN, inspired by a desire to understand the human brain and emulate its functioning. The development of ANN techniques mainly began in the 1970s (Hopfield, 1982), and thereafter the application of ANNs in hydrology started. Preliminary concepts and hydrological applications of the ANN, along with its adaptability, are very well explained in many hydrological studies such as ASCE Task Committee (2000a,b). In the early 1990s, the ANN was successfully applied in many hydrological studies related to rainfall-runoff modeling, streamflow forecasting, groundwater modeling, water quality, water management policy, precipitation forecasting, time series analysis, and reservoir operations, among others (ASCE Task Committee, 2000a). In the 1990s, studies applying the ANN technique to streamflow forecasting started appearing in the literature (e.g., Karunanithi et al., 1994; Thirumalaiah and Deo, 1998; Jain et al., 1999; Zealand et al., 1999; Coulibaly et al., 2000).

TABLE 1.1 Details of stochastic time series modeling applications in streamflow forecasting.

Year | Stochastic time series modeling techniques | Study area | Data timescale | References
1970 | ARMA | St. Lawrence River (New York), Missouri River (Iowa), Neva River (USSR), and Niger River (Africa) | Annual | Carlson et al. (1970)
1971 | Multiple regression model | Carson River, Eel River, and Tujunga Creek | Monthly | Bonné (1971)
1974 | SARIMA | Blue River, White Cloud (Indiana) | Monthly | McKerchar and Delleur (1974)
1977 | ARIMA | Saint Lawrence River at Ogdensburg (New York) | Annual | McLeod et al. (1977)
1978 | ARIMA, SARIMA | Carpathian River (Poland) | Monthly | Ledolter (1978)
1979 | Markov and ARMA models | Cheyenne River (United States) | Daily | Yakowitz (1979)
1985 | AR, PAR, PARX, ARX | Passaic River, Whippany River, South Branch Raritan River, Flat Brook, Paulins Kill, Neshanic River, Great Egg Harbor River, Oswego River, Maurice River, Manasquan River, New Jersey | Monthly | Alley (1985)
1985 | Kalman filter | Sturgeon River basin, Ontario (Canada) | Daily | Burn and McBean (1985)
1985 | PAR, ARMA, SARIMA | Thirty rivers in North and South America (Canadian rivers) | Monthly | Noakes et al. (1985)
1988 | ARMA, FGN, FARMA, FDIFF, Markov and nonparametric regression models | Saint Lawrence River, Ogdensburg (New York) | Annual | Noakes et al. (1988)
1989 | Kalman filtering algorithm, PARMA model | Saugeen River, Walkerton, Ontario (Canada) | Monthly | Jimenez et al. (1989)
1990 | ARMA, GAR | Twelve gauging stations from different countries (French Broad River, North Carolina; Red River, North Vietnam; North Llano, central Texas, USA; Virgin, Utah; Milk River, southern Alberta and northern Montana; Svir, Russia; Broken, NE Australia; and Claro, Elqui, Putaendo, Aconcagua, and Maipo Rivers, Chile) | Annual | Fernandez and Salas (1990)
1990 | ARMA | Cauvery River at Krishna Raja Sagara Reservoir, Hemavathy River at Akiahebbal, Malaprabha River at Manoli (India) | Monthly | Mujumdar and Kumar (1990)
1991 | ARIMA, state space, and entropy-based univariate forecasting models | River Orinoco and River Caroni (South America), River Krishna and River Godavari (India), and Spring Creek in Louisiana (USA) | Monthly and daily | Krstanovic and Singh (1991a,b)
1993 | PARMA, ARMA | Niger River, Koulikoro (Africa) | Monthly | Bartolini and Salas (1993)
1994 | ARMAX | Han River basin (Korea) | Monthly | Awwad et al. (1994)
1995 | SARIMA | Bhadra Reservoir (India) | Monthly | Mohan and Vedula (1995)
1996 | Hurst phenomenon and FARIMA | Lake Maggiore (Italy) | Daily | Burlando et al. (1996)
1996 | Contemporaneous PARMA model | Ottawa River basin (Canada) | Weekly | Rasmussen et al. (1996)
1997 | ARIMA, FARIMA | Lake Maggiore (Italy) | Daily and monthly | Montanari et al. (1997)
1998 | Heavy-tail time series models, PARMA | Salt River near Roosevelt, Arizona | Monthly | Anderson and Meerschaert (1998)
2000 | SFARIMA | Nile River (Aswan) | Monthly | Montanari et al. (1999)
2001 | PARFIMA, SPARFIMA | Fraser River (Canada) | Monthly | Ooms and Franses (2001)
2005 | ARMA, MS | Niagara River at Lake Erie (USA) | Annual | Akintuğ and Rasmussen (2005)
2006 | SETARMA | Jökulsá River in North-West Iceland, Britannica and Wisconsin River near Wisconsin Dells, US | Daily | Amendola et al. (2006)
2006 | NAARX, NeTAR, MNITF | Jökulsá eystri drainage basin | Daily | Astatkie (2006)
2006 | Regime-switching models, e.g., TAR, SETAR, STAR, TAR model with aggregation operators | Tatry Alpine mountain region (Slovakia) | Monthly | Komorník et al. (2006)
2006 | PARMA, truncated Pareto model | Fraser River at Hope (British Columbia) | Monthly | Tesfaye et al. (2006)
2008 | ARMA, outlier detection model, TAR, TARMA, ARCH, and BL | Wu-Shi watershed (Taiwan) | 10 days | Chen et al. (2008)
2009 | FCARSE | Barron River and Margaret River (Australia) | Daily | Shao et al. (2009)
2013 | ARIMA, GARCH | Matapedia River, near the Amqui basin (Canada) | Daily | Modarres and Ouarda (2013)
2015 | AR, FARIMA, TAR, SETAR, STAR, EXPAR, BL, MSAR | Colorado River (USA) | Daily | Amiri (2015)
2015 | ARIMA, SARIMA | All states of the United States | Annual | Valipour (2015)
2017 | ARIMA | Pyrsalman station at Jamishan River in Kermanshah Province (Iran), Williamsburg station at Cumberland River in Kentucky State (USA), South Fork Bull Run River in the US state of Oregon, and Toston station at Missouri River in Montana State (USA) | Monthly | Moeeni et al. (2017a)
2020 | ARFIMA, LR | Approximately 600 stations of North America and Europe | Annual | Papacharalampous and Tyralis (2020)

ARCH, autoregressive conditional heteroscedasticity; ARFIMA, autoregressive fractionally integrated moving average; ARIMA, autoregressive integrated moving average; ARMA, autoregressive moving average; ARMAX, autoregressive moving average model with exogenous inputs; ARX, autoregressive with exogenous variable; BL, bilinear; EXPAR, exponential autoregressive; FARIMA, fractionally differenced autoregressive integrated moving average; FARMA, fractional autoregressive moving average; FCARSE, functional coefficient autoregression; FDIFF, fractional differencing; FGN, fractional Gaussian noise; GAR, gamma autoregressive; GARCH, generalized autoregressive conditional heteroscedasticity; LR, linear regression; MNITF, multiple nonlinear inputs transfer function; MS, Markov switching; MSAR, Markov switching autoregressive; NAARX, nonlinear additive autoregressive with exogenous variables; NeTAR, nested threshold autoregressive; PAR, periodic autoregressive; PARFIMA, periodic autoregressive fractional integrated moving average; PARMA, periodic autoregressive moving average; PARX, periodic autoregressive with exogenous variable; PDAR, periodic discrete autoregressive; SARIMA, seasonal autoregressive integrated moving average; SETAR, self-exciting threshold autoregressive; SETARMA, self-exciting threshold autoregressive moving average; SFARIMA, seasonal fractional autoregressive integrated moving average; SPARFIMA, seasonal periodic autoregressive fractional integrated moving average; STAR, smooth transition threshold autoregressive; TAR, threshold autoregressive; TARMA, threshold autoregressive moving average; T-F, Thomas-Fiering.


Karunanithi et al. (1994), probably for the first time, employed a neural network for streamflow forecasting of the Huron River in Michigan using the cascade correlation (CC) algorithm, developed by Fahlman and Lebiere (1990), for training the network. Thirumalaiah and Deo (1998) used a CC neural network and compared it with error backpropagation (BP) and conjugate gradient (CG) algorithms in real-time forecasting of hourly runoff values of the Bhasta River, Maharashtra, India. Zealand et al. (1999) indicated that an ANN model with the BP algorithm outperformed a conventional stochastic-deterministic flow forecasting model in short-term forecasting in the Winnipeg River basin, Canada. From the 2000s onward, many studies were conducted to improve the performance of ANNs in streamflow forecasting (e.g., Coulibaly et al., 2000). Coulibaly et al. (2000) improved training of a multilayer feedforward neural network by using an early stopped training approach for forecasting daily reservoir inflow in the Chute-du-Diable watershed in northern Quebec, Canada. A comprehensive review of ANN applications in hydrologic modeling is presented in the findings of the ASCE Task Committee (2000a,b). In the 2000s, data-driven techniques were coupled to develop hybrid models for improving the adequacy of forecasted streamflows. In addition, data-driven techniques were integrated with data preprocessing techniques such as the wavelet transform (WT) (Adamowski and Sun, 2010), SSA, and moving average (MA) (Wu et al., 2009), and with PSO techniques for ANN training (Chau, 2006). Likewise, a neural network was coupled with fuzzy systems to develop a counterpropagation fuzzy-neural network (CFNN) model to forecast real-time streamflow in the Da-Cha River, Taiwan (Chang and Chen, 2001). A static feedforward neural network with standard BP and CG training optimization algorithms and a dynamic feedback network with real-time recurrent learning algorithms were compared for rainfall-runoff modeling of the Lan-Yang River in Taiwan (Chiang et al., 2004). Moradkhani et al. (2004) demonstrated that a self-organizing radial basis function (RBF) ANN outperformed the well-known multilayer feedforward network and self-organizing linear output map in simulating daily streamflow in the semiarid Salt River basin within the Colorado River basin, United States. Wang et al. (2006a) used three types of hybrid ANN models, namely the threshold ANN, the cluster-based ANN (CANN), and the periodic ANN (PANN), to forecast daily streamflows in the Yellow River, located in the northeastern Tibet Plateau in China, and compared their performance with normal ANN models. Prada-Sarmiento and Obregón-Neira (2009) explored the possibility of linking the weights of simple multilayer perceptrons (MLPs) with some physical characteristics of watersheds by means of multivariate regression analysis to obtain better forecasting results. Besaw et al. (2010) developed a method for predicting ungauged streamflow using a generalized regression neural network, a CFNN model, and climate and hydrologic data. Monthly river flows forecasted by ANNs in an arid inland basin of Northwest China were improved by presenting a spatially integrated approach of ANNs (Huo et al., 2012).
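To make the basic workflow concrete, the sketch below shows a one-step-ahead forecast from a small feed-forward network trained on lagged flows. It is a minimal illustration rather than any of the cited configurations: the synthetic series, the choice of three antecedent-flow inputs, and the network size are assumptions made here, and scikit-learn's MLPRegressor stands in for the various training algorithms discussed above.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
# Synthetic daily flow series with a seasonal cycle plus noise.
flows = 40 + 15 * np.sin(np.arange(1500) * 2 * np.pi / 365) + rng.normal(0, 2, 1500)

lags = 3  # predict Q(t) from Q(t-1), Q(t-2), Q(t-3)
X = np.column_stack([flows[i:len(flows) - lags + i] for i in range(lags)])
y = flows[lags:]

split = int(0.8 * len(y))                      # chronological train/test split
scaler = StandardScaler().fit(X[:split])
X_train, X_test = scaler.transform(X[:split]), scaler.transform(X[split:])

ann = MLPRegressor(hidden_layer_sizes=(10,), max_iter=2000,
                   early_stopping=True, random_state=0)
ann.fit(X_train, y[:split])
print("Hold-out R^2:", round(ann.score(X_test, y[split:]), 3))
```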


Also, the algorithms employed for training ANNs were compared in many studies in order to select the best one for streamflow forecasting. For example, Kişi (2007) compared the BP, CG, CC, and Levenberg–Marquardt algorithms in prediction of short-term daily streamflow for the North Platte River in Colorado, United States. Mutlu et al. (2008) compared the abilities of MLP and RBF neural networks to forecast daily flows at multiple gauging stations in the Eucha Watershed located in northwest Arkansas and northeast Oklahoma, United States. Talaee (2014) optimized MLP networks with three training algorithms, i.e., resilient backpropagation (MLP-RP), variable learning rate (MLP-GDX), and Levenberg–Marquardt (MLP-LM), to forecast streamflow in the Aspas Watershed of Fars province in southwestern Iran. In long lead time streamflow forecasting, a wavelet-based neural network (WNN) outperformed the ANN in terms of forecast accuracy and precision when both WNN and ANN were combined with an ensemble method using block bootstrap sampling (BB) on the Bow River, Alberta, Canada (Kasiviswanathan et al., 2016). In the 2000s, the ANN and fuzzy logic were integrated to develop hybrid neuro-fuzzy systems, which use the learning capability of an ANN to solve the problem of obtaining a set of fuzzy if-then rules in fuzzy system design (Nayak et al., 2004, 2005b). Likewise, the ANFIS model, first developed by Jang (1993), has been widely used in streamflow forecasting since 2000 (El-Shafie et al., 2007; Firat and Güngör, 2008; Pramanik and Panda, 2009; Sanikhani and Kişi, 2012; Yarar, 2014; Azad et al., 2018). Pulido-Calvo and Portela (2007) combined the ANN with ARIMA for one-step daily flow forecasting in the watershed of the Tua River in the Douro Basin (north Portugal). Adamowski and Sun (2010) proposed a method coupling discrete wavelet transforms (WT) and ANN, which was applied to flow forecasting at lead times of 1 and 3 days using 42 years (1965–2007) of daily streamflow data from two nonperennial rivers in Cyprus (Kargotis at Evrychou and Xeros at Lazarides). Their results showed that WT-ANN models provide more accurate streamflow forecasts than those obtained from ANNs alone. Shiri and Kişi (2010) investigated the applicability of a hybrid wavelet neuro-fuzzy (WNF) model to forecast daily, monthly, and yearly streamflows of the Filyos River in the Western Black Sea Region of Turkey. Results indicated that the WNF models increase the accuracy of single neuro-fuzzy models, especially in forecasting yearly streamflows. The literature reveals that studies involving the development and application of hybrid models coupling the ANN with other data-driven techniques flourished after 2005, and some of these studies are summarized in Table 1.2. The wavelet transform (WT) is the technique most often used in hybrid ANN models to obtain optimized initial weights of the ANN. The WT provides useful information in both the time and frequency domains of the original signal, thereby giving better information about the physical structure of the original data. Wavelets have some attractive properties, which is the major reason for their use in analyzing streamflow time series.

TABLE 1.2 Summary of studies that adopted hybrid techniques involving artificial neural network models for streamflow forecasting.

Year | Hybridized techniques | Study area | Data timescale | References
2004 | ANN, fuzzy logic | Baitarani River in Orissa state, India | Annual | Nayak et al. (2004)
2006 | ANN, WT, SOM | S. Chiara at Tirso basin (Sardinia) | Monthly | Cannas et al. (2006)
2006 | ANN, MLR | Kolar River in Madhya Pradesh (India) | Hourly | Chetan and Sudheer (2006)
2006 | ANN, TR, FCM, PAR | Tangnaihai station at Yellow River in northeastern Tibet Plateau (China) | Daily | Wang et al. (2006a)
2007 | ANN, ARIMA | Castanheiro at Tua River and Cidadelhe at Côa River (Portugal) | Daily | Pulido-Calvo and Portela (2007)
2008 | ANN, WT | Gerdelli Station on Canakdere River and Isakoy Station on Goksudere River in the Eastern Black Sea Region (Turkey) | Monthly | Kişi (2008a)
2008 | ANN, moving block bootstrap | Synthetic data | Annual | Sudheer et al. (2008)
2009 | ANN, WT | Kilayik station on Beyderesi River and Rüstümköy station on Kocasu River (Turkey) | Monthly | Partal (2009)
2009 | ANN-MA, ANN-SSA1, ANN-SSA2, ANN-WMRA1, and ANN-WMRA2 | Lushui and Daning watersheds (China) | Daily | Wu et al. (2009)
2010 | WT, ANN | Kargotis River at Evrychou and Xeros River at Lazarides (Cyprus) | Daily | Adamowski and Sun (2010)
2010 | WT, neuro-fuzzy | Filyos River in the Western Black Sea region (Turkey) | Daily, monthly, annual | Shiri and Kişi (2010)
2011 | WT, ANN, bootstrap | Mahanadi River (India) | Daily | Tiwari and Chatterjee (2011)
2012 | WT, ANN | Cauvery at Kudige and Hemavathy, Karnataka (India) | Monthly | Maheswaran and Khosa (2012)
2013 | WT, ANN | Huanxian, Weijiabao, and Xianyang stations at Weihe River (China) | Monthly | Wei et al. (2013)
2014 | WT, ANN | Çoruh River in Eastern Black Sea Region (Turkey) | Monthly | Danandeh Mehr et al. (2014a)
2014 | WT, ANN | Sobradinho hydroelectric plant in São Francisco River (Brazil) | Daily | Guimarães Santos and da Silva (2014)
2014 | WT, neuro-fuzzy | Porsuk, Kocasu, Sakarya, Aladag, and Mudurnu stations at Sakarya Basin (Turkey) | Monthly | Yarar (2014)
2015 | ANN, WT, EMD, SSA | Yangtze River (China) | Monthly | Zhang et al. (2015)
2016 | ANN, WT | Railway Parade station on Ellen Brook River (Western Australia) | Daily, weekly, monthly | Badrzadeh et al. (2016)
2016 | EEMD, FT, DBN | Three Gorges reservoir (China) | Daily | Bai et al. (2016)
2016 | SOV, ANN | Kholyan and Kharjagil stations at Navrood River catchment located at Gilan Province (north of Iran) | Hourly | Kashani et al. (2016)
2016 | ANN, WT, block bootstrap | Calgary at Bow River, Alberta (Canada) | Daily | Kasiviswanathan et al. (2016)
2016 | EMD, RBF, ANN, EF | Dingjiagou station at Wuding River basin (China) | Annual | Zhang et al. (2016)
2017 | GA, ANN, WT | Lighvanchai and Aghchai watersheds (Iran) | Daily, monthly | Abbaszadeh et al. (2017)
2017 | ANFIS, FFA | Pahang River (Malaysia) | Monthly | Yaseen et al. (2017)
2018 | CWT, MGGP, ANN | Goksu-Gokdere in Seyhan basin (Turkey) | Monthly | Hadi and Tombul (2018)
2018 | ANN, WT | Sobradinho Dam located on the São Francisco River (Northeast Brazil) | Monthly | Honorato et al. (2018)
2018 | ANN, WT, EMD, EEMD | Tangnaihai station at Yellow River (northwest China) | Daily | Li et al. (2018)
2018 | AEEMD, ANN | Ertan, Yichang, and Cuntan stations in the Yangtze River basin (China) | Monthly | Tan et al. (2018)
2019 | ANN, SETAR, GARCH | Brantford and Galt stations on Grand River (Canada) | Monthly | Fathian et al. (2019b)
2019 | ANN, EMD, EEMD, STL | Tangnaihai and Lanzhou stations located on the Yellow River, and Shigu and Yalongjiang stations located on the Yangtze River (China) | Daily | Li et al. (2019)
2019 | GCA, SVR, STL, GRNN | Shigu and Xiangjiaba stations at Yangtze River (China) | Monthly | Luo et al. (2019)
2020 | ANFIS, SFLA | Nong Son station and Thanh My station (Vietnam) | Monthly | Mohammadi et al. (2020)
2020 | ANN, IWD | Nong Son and Thanh My stations of Vu Gia Thu Bon river basin (Vietnam) | Monthly | Pham et al. (2020)

ANFIS, adaptive neuro-fuzzy inference system; ANN, artificial neural network; CWT, continuous wavelet transformation; DBN, deep belief network; EEMD, ensemble empirical mode decomposition; EF, external forces; EMD, empirical mode decomposition; FCM, fuzzy C-means; FFA, firefly algorithm; FT, Fourier transformation; GARCH, generalized autoregressive conditional heteroscedasticity; GCA, grey correlation analysis; GRNN, generalized regression neural network; IWD, intelligent water drop; MA, moving average; MGGP, multigene genetic programming; MLR, multiple linear regression; PAR, periodic autoregressive; RBF, radial basis function; SETAR, self-exciting threshold autoregressive; SFLA, shuffled frog leaping algorithm; SOM, self-organizing map; SOV, second-order Volterra; SSA, singular spectrum analysis; STL, seasonal-trend decomposition procedure based on Loess; TR, threshold regression; WMRA, wavelet multiresolution analysis; WT, wavelet transformation.

The WT decomposes the original time series of streamflow data such that the wavelet-transformed data improve the ability of a forecasting model by capturing useful information at various resolution levels (Adamowski, 2008). In a large number of studies, the WT is coupled with both ANN and SVM for forecasting of streamflows.
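A minimal sketch of this kind of wavelet preprocessing is given below, assuming the PyWavelets package; the wavelet family, decomposition level, and network size are illustrative choices rather than those of any cited study. Note also that decomposing the full record at once uses future information, so an operational forecast would decompose only the data available up to the forecast time.

```python
import numpy as np
import pywt
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(2)
flows = 30 + 10 * np.sin(np.arange(512) * 2 * np.pi / 12) + rng.normal(0, 2, 512)

# Multilevel discrete wavelet decomposition: one approximation + three detail subseries.
coeffs = pywt.wavedec(flows, wavelet="db4", level=3)

# Reconstruct each subseries at the original length by zeroing the other coefficients.
components = []
for i in range(len(coeffs)):
    masked = [c if j == i else np.zeros_like(c) for j, c in enumerate(coeffs)]
    components.append(pywt.waverec(masked, "db4")[: len(flows)])

# Use the subseries values at time t-1 as inputs to forecast the flow at time t.
X = np.column_stack([comp[:-1] for comp in components])
y = flows[1:]
split = int(0.8 * len(y))

ann = MLPRegressor(hidden_layer_sizes=(8,), max_iter=3000, random_state=0)
ann.fit(X[:split], y[:split])
print("Hold-out R^2:", round(ann.score(X[split:], y[split:]), 3))
```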

1.4.3 Other AI techniques

Over the last two and a half decades, the application of data-driven AI models to streamflow forecasting has gradually increased and received growing attention, especially from hydrologists (Hsu et al., 1995; Chang and Chen, 2001; Dorado et al., 2003; Sivapragasam and Liong, 2005; Kalteh and Berndtsson, 2007; Toth and Brath, 2007; Kentel, 2009; Samsudin et al., 2011; Jothiprakash and Magar, 2012; Valipour and Montazar, 2012a,b; He et al., 2014; Danandeh Mehr et al., 2014a,b, 2015; Terzi and Ergin, 2014; Yaseen et al., 2015; Nourani et al., 2017; Liu et al., 2017; Yaseen et al., 2018a; Kambalimath and Deka, 2021). AI approaches are very useful in handling the spatial and temporal variability of the inputs, as they place low professional requirements on operators and have fast response speeds (Hejazi and Cai, 2009; Zhang et al., 2018a,b). With the increased application of AI-based models, the major limitations highlighted by various researchers are the high risk of overfitting, convergence to local optima, and vanishing gradients, which limit the application of these models (Yang et al., 2017). Comprehensive reviews of AI-based techniques in the field of water resources engineering have been presented by Dawson and Wilby (2001) and Yaseen et al. (2015); in particular, Yaseen et al. (2015) reviewed AI-based techniques for streamflow forecasting from 2000 to 2015. In streamflow forecasting, the most widely employed AI techniques are the ANN (McCulloch and Pitts, 1943; Kohonen, 1988; Chiang et al., 2004; Cigizoglu, 2005), SVM (Vapnik, 1995), fuzzy logic approach (Zadeh, 1965), evolution strategies (Schwefel, 1981), GP (Koza, 1992), GEP, GA (Holland, 1975), swarm intelligence algorithms such as PSO (Kennedy and Eberhart, 1995) and ACO (Dorigo et al., 1996), wavelet-artificial intelligence (W-AI) models (Grossmann et al., 1984), extreme learning machine (ELM), and the M5 model tree (M5T). It is inferred that AI-based data-driven techniques provide better streamflow forecasts than other data-driven techniques, including the conventional time series models and regression-based techniques (Yaseen et al., 2015, 2016b).
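As one illustration of the techniques listed above, the brief sketch below fits a support vector regression model to lagged flows using scikit-learn; the kernel and penalty settings are illustrative assumptions rather than values taken from any cited study.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(6)
flows = 25 + 8 * np.sin(np.arange(600) * 2 * np.pi / 12) + rng.normal(0, 1.5, 600)

X = np.column_stack([flows[1:-1], flows[:-2]])  # predictors: Q(t-1), Q(t-2)
y = flows[2:]                                   # target: Q(t)
split = int(0.8 * len(y))

svr = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.1))
svr.fit(X[:split], y[:split])
print("Hold-out R^2:", round(svr.score(X[split:], y[split:]), 3))
```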

1.4.4 Hybrid data-driven techniques

In the last three decades, several data-driven techniques, ranging from traditional time series modeling to modern AI techniques, have been used for streamflow forecasting by researchers from different parts of the globe.


In addition, the literature on data-driven streamflow forecasting is rapidly growing, with more and more research studies appearing over the years. Consequently, a large number of data-driven techniques now exist that have proved their efficacy in making accurate streamflow forecasts. Whenever the successful application of a new data-driven technique to streamflow forecasting is reported, it attracts researchers who fine-tune the methodology to further improve the forecasts. Such improvements are often accomplished by combining or integrating the original or traditional data-driven technique with other AI-based techniques. When two or more data-driven approaches are coupled together, the individual strengths of each approach are utilized in a synergistic manner for the development of more robust, hybrid, and intelligent systems (See and Openshaw, 2000). One major advantage of hybrid modeling is that it combines a streamflow data pre/postprocessing scheme with AI-based techniques, which plays an auxiliary role in improving the forecasting accuracy. Hence, modern hybrid approaches have a strong capability to model and forecast linear stationary as well as nonlinear and nonstationary streamflow datasets with high accuracy. The first attempt to forecast streamflow using a multimodel combination approach was made by Shamseldin et al. (1997), who combined three methods, namely the simple average method, the weighted average method, and the neural network method. Eleven catchments from different countries were used, and the combined forecasts of the river flow time series were more accurate than those obtained from the individual models. The successful application of the multimodel combination approach opened the door for researchers to explore the applicability of this modern approach in various applications in hydrology. Since then, the hybrid approach has been adopted in a number of research studies dealing with multimodel combinations to improve forecasting accuracy (Xiong et al., 2001; Coulibaly et al., 2005; Kalteh, 2013; Wu et al., 2015). A summary of the studies that adopted hybridized approaches of statistical time series modeling in streamflow forecasting is presented in Table 1.3. A few of the studies reported during the last decade that adopted hybrid data-driven models (excluding ANN) in streamflow forecasting are summarized in Table 1.4. A classification chart showing hybrid data-driven models is presented in Fig. 1.5.
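The two simplest of these combination schemes, the simple average and the error-weighted average, can be sketched as follows; the two base forecasts are synthetic placeholders standing in for, say, a time series model and an ANN, and the inverse-MSE weighting rule is one common choice rather than the specific scheme of any cited study.

```python
import numpy as np

rng = np.random.default_rng(3)
observed = 20 + rng.normal(0, 1, 100)            # calibration-period flows
forecast_a = observed + rng.normal(0, 1.5, 100)  # base model A (e.g., a time series model)
forecast_b = observed + rng.normal(0, 0.8, 100)  # base model B (e.g., an ANN)

# Simple average method: equal weights for all base forecasts.
simple_avg = 0.5 * (forecast_a + forecast_b)

# Weighted average method: weights inversely proportional to calibration-period MSE.
mse = np.array([np.mean((observed - forecast_a) ** 2),
                np.mean((observed - forecast_b) ** 2)])
weights = (1.0 / mse) / np.sum(1.0 / mse)
weighted_avg = weights[0] * forecast_a + weights[1] * forecast_b

for name, combined in [("simple average", simple_avg), ("weighted average", weighted_avg)]:
    rmse = np.sqrt(np.mean((observed - combined) ** 2))
    print(f"{name} combination RMSE: {rmse:.3f}")
```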

1.5 Comparison of different data-driven techniques

Many studies have evaluated the comparative performance of different data-driven AI techniques in forecasting streamflows accurately and precisely. During the last two decades, it has become a customary approach to employ multiple data-driven techniques for streamflow forecasting. Researchers are also applying to streamflow forecasting many data-driven AI techniques that have proved their adequacy in predicting other hydrologic and nonhydrologic variables satisfactorily.

TABLE 1.3 Summary of studies that adopted hybridized statistical time series modeling in streamflow forecasting.

Year | Hybrid techniques | Study area | Data timescale | References
2000 | ARMA, MBB | Fort Benton station on Missouri River in Montana, Petrokrepost station on Neva River in USSR, Outflow station on Lake Victoria River in Uganda, and Mongalla station on Lake Albert River in Sudan | Annual | Srinivas and Srinivasan (2000)
2001 | MBB, PARMA | Weber River near Oakley, UT, and Beaver River near Beaver, UT | Annual | Srinivas and Srinivasan (2001)
2003 | MPAR, Lane's condensed disaggregation, MCPAR | Three subbasins of Han River basin (Korea) | Monthly | Rieu et al. (2003)
2005 | ARMA, GARCH | Tangnaihai station at Yellow River (China) | Daily, monthly | Wang et al. (2005)
2009 | ANN, MA, SSA1, SSA2, WMRA1, and WMRA2 | Lushui and Daning watersheds of Yangtze River (China) | Daily | Wu et al. (2009)
2011 | SSA, LRF, and ARIMA | Biliuhe and Dahuofang reservoirs in Liaoning Province (China) | Annual | Zhang et al. (2011)
2015 | ARIMA, EEMD | Biliuhe, Dahuofang, and Mopanshan reservoirs (China) | Annual | Wang et al. (2015)
2016 | ARMA, GARCH | Yangtze River (China) | Daily | Xie et al. (2016)
2016 | SARIMA, ANN | Pirsalman station at Jamishan River (Iran) | | Moeeni and Bonakdari (2017)
2017 | ARIMA, NARX | Dez Reservoir (Iran) | Daily | Banihabib et al. (2018)
2017 | SARIMA, GEP, and ANN | Jamishan Dam (Iran) | Monthly | Moeeni et al. (2017b)
2018 | HMM, GMR | Three hydrometric stations at upper Yangtze River (China) | Monthly | Liu et al. (2018)
2018 | EMD, EEMD, and ARIMA | Tangnaihai station at Yellow River (China) | Daily | Wang et al. (2018)
2019 | SETAR, GARCH | Six stations located upstream of Zarrineh Rood River (Iran) | Daily | Fathian et al. (2019a)
2019 | ANN, MARS, SETAR, and GARCH | Brantford and Galt situated on Grand River (Canada) | Monthly | Fathian et al. (2019b)
2019 | GEP, MARS, MLR, FARIMA, and SETAR | Umpqua and Ocmulgee River stations, located in Georgia and Oregon (USA) | Monthly | Mehdizadeh et al. (2019)

ANN, artificial neural network; EEMD, ensemble empirical mode decomposition; EMD, empirical mode decomposition; GEP, gene expression programming; GMR, Gaussian mixture regression; HMM, hidden Markov model; LRF, linear recurrent formulae; MA, moving average; MARS, multivariate adaptive regression splines; MBB, moving block bootstrap; MCPAR, multivariate contemporaneous periodic autoregressive; MLR, multiple linear regression; MPAR, multivariate periodic autoregressive; NARX, nonlinear autoregressive model with exogenous inputs; SETAR, self-exciting threshold autoregressive; SSA, singular spectrum analysis; WMRA, wavelet multiresolution analysis.

TABLE 1.4 Brief review of studies that adopted hybrid techniques (excluding artificial neural network models) for streamflow forecasting.

Year | Hybrid techniques | Study area | Country | Data timescale | References
2013 | MWVC, BMA | Selway River Nr Lowell ID and St. Joe River at Calder ID | USA | Daily, weekly, monthly | Rathinasamy et al. (2013)
2014 | EMD, SVR | Wei River basin | China | Monthly | Huang et al. (2014)
2014 | WT, SVR | East Fork White River, near Bedford, and Eel River, near Logansport, Indiana | United States | Daily, monthly | Liu et al. (2014)
2015 | WT, GEP | Baihe (China), Brosna (Ireland), Nan (Thailand), Yanbian (China) | China, Ireland, Thailand | Daily | Shoaib et al. (2015)
2016 | MVRVM, WT | Yellowstone River in the Uinta Basin in Utah | United States | Monthly | Maslova et al. (2016)
2017 | MA, MGGP | Senoz Stream located in Rize Province | Turkey | Daily | Danandeh Mehr and Kahya (2017)
2017 | WT, LGP | Pataveh and Shah Mokhtar, on the Beshar River at Yasuj | Iran | Monthly | Ravansalar et al. (2017)
2018 | GEP, GA | Givi station at Shavir Creek in Sefidrood River basin | Iran | Monthly | Danandeh Mehr (2018)
2018 | CT, MGGP | Eherchay and Livan stations in East Azerbaijan province of Iran, and Sogutluhan and Yamula stations on Kizilirmak River in central Anatolia, Turkey | Iran, Turkey | Daily | Ghorbani et al. (2018)
2018 | HMM, GMR | Yichang station at Yangtze River, Wulong station on Wu River, and Beibei station on Jialing River | China | Monthly | Liu et al. (2018)
2018 | WT, ELM | Baghdad station at Tigris River | Iraq | Monthly | Yaseen et al. (2018a)
2019 | MARS, DE | Tigris River | Iraq | Monthly | Al-Sudani et al. (2019)
2019 | SETAR, GARCH | Six stations located upstream of Zarrineh Rood dam | Iran | Daily | Fathian et al. (2019a)
2019 | MARS, RF, SETAR, GARCH | Brantford and Galt stations on Grand River | Canada | Monthly | Fathian et al. (2019b)
2019 | VIC, AR, ARMAX, WNN, WNARX | Hirakud reservoir catchment of the Mahanadi River basin | India | Daily | Nanda et al. (2019)
2019 | VMD, DBN, IPSO | Yangxian and Ankang stations in Han River basin | China | Daily | Xie et al. (2019)
2020 | RF, DAE, SVR | Bookan dam located in the Lake Urmia basin | Iran | Monthly | Abbasi et al. (2020)
2020 | LASSO, FCM, DBN | Knoxville and Franklin stations at Tennessee River | USA | Monthly | Chu et al. (2020)
2020 | VMD, SVM, PSO | Three Gorges and Danjiangkou reservoirs in Yangtze Valley | China | Monthly | Feng et al. (2020)
2020 | MOVE types 1–4, KTRL, KTRL2, RLOC, WT | Sixty-seven pairs of target-index stations from Reference Hydrometric Basin Network database | Canada | Monthly | Nalley et al. (2020)
2020 | EGB, GMM | Cuntan and Hankou stations on Yangtze River basin | China | Monthly | Ni et al. (2020)
2021 | BI, LSSVM, FA | Kerki and Kizilyar stations at Amu Darya Basin | Uzbekistan | Seasonal, annual | Gao et al. (2021)

BI, Bayesian inference; BMA, Bayesian model average; DAE, deep auto-encoder; DBN, deep belief networks; DE, differential evolution; EGB, extreme gradient boosting; ELM, extreme learning machine; FA, factorial analysis; FCM, fuzzy C-means; GA, genetic algorithm; GARCH, generalized autoregressive conditional heteroscedasticity; GEP, gene expression programming; GMM, Gaussian mixture model; IPSO, improved particle swarm optimization; KTRL, Kendall-Theil robust line; LASSO, least absolute shrinkage and selection operator; LGP, linear genetic programming; LSSVM, least squares support vector machine; MA, moving average; MARS, multivariate adaptive regression splines; MGGP, multigene genetic programming; MOVE, maintenance of variance; MVRVM, multivariate relevance vector machine; MWVC, multiwavelet Volterra coupled; PSO, particle swarm optimization; RF, random forest; RLOC, robust line of organic correlation; SETAR, self-exciting threshold autoregressive; SVM, support vector machine; SVR, support vector regression; VMD, variational mode decomposition; WT, wavelet transformation.

FIGURE 1.5 Chart showing classification of hybrid data-driven models.

Moreover, the accuracy of a newly employed AI technique, or of multiple AI techniques, is compared with that of the original or proven AI technique to judge the capability and reliability of the former in forecasting streamflows.


A few of the earlier studies compared ANN and stochastic time series approaches in making streamflow forecasts (e.g., Raman and Sunilkumar, 1995; Thirumalaiah and Deo, 2000; Kişi, 2003, 2005). Later on, with the start of ANN applications in streamflow forecasting, the performance of the ANN was compared with that of conventional ARIMA techniques (e.g., Jain et al., 1999; Castellano-Méndez et al., 2004; Valipour et al., 2013). It was further revealed that the ANN technique can be used more successfully than the other traditional statistical techniques in modeling and forecasting streamflow data (e.g., Smith and Eli, 1995; Minns and Hall, 1996). Adamowski (2008) explored the use of WT and cross-WT as standalone techniques for forecasting daily streamflows at the Parzen station on the Skrwa Prawa River in Poland and compared them with ANN models and simple perseverance models. Wang et al. (2009) examined five data-driven techniques, namely ARMA, ANN, ANFIS, GP, and SVM models, to forecast inflows of the Lancangjiang River, China, utilizing 52 years (1953–2004) of monthly inflow data. Performance of the models was evaluated using four statistical measures, i.e., correlation coefficient, Nash–Sutcliffe efficiency, root mean square error, and mean absolute percentage error, and the results indicated that the SVM model gave the best forecasting accuracy. Shabri and Suhartono (2012) introduced the least squares support vector machine (LSSVM) model for forecasting streamflow at the Tualang and Rambutan stations located on the Kinta River in Perak, Peninsular Malaysia, using 30-year (October 1976 to July 2006) and 41-year (January 1961 to December 2002) records, respectively. They compared the performance of the LSSVM model with that of three conventional models, i.e., ARIMA, ANN, and SVM models, using four statistical measures, viz. mean absolute error (MAE), root mean square error (RMSE), correlation coefficient (R), and Nash–Sutcliffe coefficient of efficiency (CE). The forecasting results indicated that the LSSVM model performed better than the other models at both stations. Wu and Chau (2010) applied four data-driven techniques, namely ARMA, k-nearest neighbors (KNN), ANN, and phase space reconstruction-based ANN (PSR-ANN), for the prediction of monthly streamflows at four stations located in Xiangjiaba, Cuntan, Manwan, and Danjiangkou, China. The results of the study suggested that the KNN model performs the best among the four models. Valipour et al. (2013) compared the capability of ARMA, ARIMA, and AR-ANN models to forecast 47 years (1960–2007) of monthly inflows of the Dez reservoir at Taleh Zang station, Iran. The results revealed that the ARIMA model has less error than the ARMA model, and the AR-ANN model has less error than the ARIMA model. Adnan et al. (2020) compared the accuracy of four heuristic methods, i.e., the optimally pruned extreme learning machine (OP-ELM), the LSSVM, multivariate adaptive regression splines (MARS), and an M5 tree-based model, in estimating streamflow at two hydrometric stations, namely Kalam and Chakdara in the Swat River basin, Pakistan. Based on the overall results, the LSSVM and MARS are recommended for monthly streamflow prediction.
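For reference, the four performance measures recurring in these comparisons can be computed in a few lines; the observed and simulated series below are synthetic placeholders used only to show the formulas.

```python
import numpy as np

rng = np.random.default_rng(4)
obs = 100 + 20 * rng.random(60)        # placeholder observed flows
sim = obs + rng.normal(0, 5, 60)       # placeholder forecasted flows

r = np.corrcoef(obs, sim)[0, 1]                                        # correlation coefficient
nse = 1 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)   # Nash-Sutcliffe efficiency
rmse = np.sqrt(np.mean((obs - sim) ** 2))                              # root mean square error
mape = 100 * np.mean(np.abs((obs - sim) / obs))                        # mean absolute percentage error

print(f"R = {r:.3f}, NSE = {nse:.3f}, RMSE = {rmse:.2f}, MAPE = {mape:.1f}%")
```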


1.6 Current trends in streamflow forecasting

Many different data-driven techniques, such as AI models, have been preferred in streamflow forecasting over physically based hydrological models and traditional stochastic time series models since the 1990s. This is because data-driven models deliver better forecasts with higher accuracy and lower computational time. Data-driven techniques used in streamflow forecasting have continued to progress over the last two decades through the dedicated efforts of hydrologists and water resource managers motivated to explore the potential of many newly developed intelligent models (Al-Sudani et al., 2019). Although the data-driven models are highly efficient in making reliable streamflow forecasts, they are sensitive and carry uncertainty arising from poor quality of data or predictors, inappropriate selection of predictor variables, and the inability of the model to capture the physical process (Abbasi et al., 2020). Furthermore, data-driven models cannot deal with nonstationary streamflow data unless the input and/or output data are preprocessed (Cannas et al., 2006). These limitations of the data or predictor variables have been overcome by employing an additional set of data-driven techniques called time-frequency decomposition techniques, such as the wavelet transform (discrete and continuous), SSA, empirical mode decomposition, ensemble empirical mode decomposition, and variational mode decomposition, which are used for data preprocessing. The uncertainties are further reduced by applying data-driven preprocessing approaches for choosing a suitable set of predictors (Bowden et al., 2005), which include forward selection (Wang et al., 2006b), mutual information (Abdourahamane et al., 2019), and random forest (Abbasi et al., 2020). Hence, in recent studies on streamflow forecasting, it has become customary practice to adopt two data-driven techniques and combine them in order to improve the accuracy of forecasts. This integration of two or more techniques is called hybridization, which has now become a current trend in streamflow forecasting studies. Hybrid data-driven models take advantage of black-box or AI-based models and their ability to efficiently describe, in statistical terms, the observed data as well as other prior information concealed in the observed records. Moreover, hybrid models represent the joint application of AI-based methods with preprocessing methods to enhance overall model performance (Nourani et al., 2014). Nowadays, traditional time series models are also coupled with soft-computing AI techniques to generate a new form of hybrid model in many studies, yielding reasonable accuracy and promising results in streamflow forecasts. For example, Dariane et al. (2018) developed a new hybrid technique, namely an entropy model coupled with an input selection approach and wavelet transformation, for long-term streamflow forecasting with a 10-year lead time. They utilized 45 years of monthly streamflow data of the Taleghan basin, Tehran, and concluded that the hybrid model substantially improved the results of the basic entropy model.
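One of the predictor-screening approaches mentioned above, mutual information, can be sketched as follows; the candidate predictors (lagged synthetic flow and rainfall) and the scoring function from scikit-learn are illustrative assumptions, not the setup of any cited study.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression

rng = np.random.default_rng(5)
n = 500
rain = rng.gamma(2.0, 5.0, n)
flow = np.zeros(n)
for t in range(1, n):  # simple synthetic catchment: flow persists and responds to rainfall
    flow[t] = 0.7 * flow[t - 1] + 0.3 * rain[t - 1] + rng.normal(0, 1)

# Candidate predictors of Q(t): Q(t-1), Q(t-2), P(t-1), P(t-2)
X = np.column_stack([flow[1:-1], flow[:-2], rain[1:-1], rain[:-2]])
y = flow[2:]

scores = mutual_info_regression(X, y, random_state=0)
for name, score in zip(["Q(t-1)", "Q(t-2)", "P(t-1)", "P(t-2)"], scores):
    print(f"{name}: MI = {score:.3f}")
```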


Yaseen et al. (2018b) examined a new hybrid model that integrates a novel data preprocessing method, namely the rolling mechanism-grey model, with an ANN algorithm to forecast streamflow. These models were developed utilizing daily streamflow data from 1989 to 2008 for the Johor River and from 1995 to 2014 for the Kelantan River, Malaysia. The results revealed that the hybrid model not only smoothed the data but also characterized the unknown information, particularly with limited and missing data. More recently, Tikhamarine et al. (2020) employed an effective hybrid model by integrating a new metaheuristic optimization algorithm, namely the grey wolf optimization (GWO) algorithm, with AI models such as the SVM, MLP neural network, and autoregressive model for monthly streamflow forecasting of the Aswan High Dam on the Nile River, Egypt, utilizing 130 years (1871–2000) of data. The results indicated that the AI models integrated with GWO proved to be more accurate and effective compared with the standard AI techniques. It was also found from the comparative performance evaluation that the integrated SVR-GWO model outperformed the ANN-GWO and MLR-GWO models.

1.7 Key challenges in forecasting of streamflows

Streamflow forecasting is a very challenging task for forecasters and water managers, mainly due to the randomness present in streamflow datasets. It is well known that streamflow is technically difficult to predict or forecast, and the plausible reasons are its complex, nonlinear, dynamic, and chaotic behavior, along with the randomness present in its historical records (Dietrich, 1987; Wen, 2009; Farmer, 2016). This complex hydrologic process is affected by many factors such as topography, rainfall distribution, soil properties, land use, climate change, and urban development (Yaseen et al., 2019a). Hence, streamflow forecasts are subject to various sources of error, which need to be estimated and, wherever possible, removed from the forecasts (Liu et al., 2011). In both process-based and data-driven modeling of streamflows, successful and accurate forecasts depend strongly on the development of a reliable, well-fitted model, which requires high-quality streamflow data covering an adequate period (Chau and Jin, 1998; Chau and Jiang, 2002). Presently, streamflow data are available for several catchments of the world where gauging started many years ago. However, there still exist many ungauged catchments across the globe, especially in developing countries, where unavailability or limited availability of historical streamflow data poses a major limitation to streamflow forecasting. In many cases, even when historical streamflow datasets are available, data reliability plays a major role in satisfactory model development and forecasting. Another challenge in accurate streamflow forecasting is the selection of the optimum time scale of the data. Sometimes, monthly and/or annual streamflow data of a reservoir or river are readily available instead of daily data, which restricts precise short-term prediction of streamflow.


On the other hand, it is reported that daily streamflow data, which have more irregular fluctuations than monthly data, are often not accurate, whereas monthly streamflow estimates are reasonable (Wang et al., 2011). Thus, modeling of monthly streamflows is more efficient in providing promising forecasts than modeling of daily data. Streamflow modeling and forecasting help to optimize water resources systems and to mitigate the impact of destructive natural disasters such as floods and droughts by implementing long-term planning (structural and nonstructural measures) and short-term emergency warning. Also, streamflows in many large reservoir systems are not directly measured and thus have to be estimated through water balance or by other means. Such estimates of streamflow may not be reliable enough to build a reasonable forecasting model. In data-driven modeling, one major concern in streamflow forecasting is whether the underlying process should be modeled as linear or nonlinear (Modarres and Ouarda, 2013). In addition to qualitative and quantitative data limitations, challenges in streamflow forecasting arise from the development of inaccurate and imprecise models. During the last three decades, hydrologists, water managers, and forecasters have worked hard to improve the forecasting accuracy of data-driven models by adopting several tools and techniques with a wide variety of computational algorithms, thereby advancing the subject of streamflow modeling. Many researchers have attempted to address the issue of inaccuracy in both short-term and long-term forecasts using varying methodological approaches, either process-based or data-driven. Accuracy of the physically based models is often limited due to assumptions of normality, linearity, and variable independence. Conversely, the conceptual models require minimal information, have learning ability and rapid development times, and are well suited to simulating the complex process. The conceptual or parametric models are able to capture nonlinear relationships between predictors and simulated streamflow variables without prior knowledge of the flow physics and by advantageously utilizing minimal information from historical hydrological inputs (Azmi and Sarmadi, 2016; Maier et al., 2014). In the case of physically based models, considerable research effort has gone into improving the accuracy of simulated streamflow time series for particular river basins by expanding the set of hydrological input variables, which in turn increases the difficulty of implementing the model under data-scarce conditions. The data-driven models overcome most of the challenges experienced by the physically based and conceptual models. In the context of classical time series models, prediction or forecasting methods are generally based on the analysis, representation, and projection of existing time series data (Cryer, 1986). Reliable streamflow forecasts are particularly difficult to obtain in time series modeling when the target system is governed by dynamic processes under chaotic and stochastic conditions. To deal with the challenges of traditional approaches such as statistical tools, including time series modeling, AI approaches such as ANN models have been adopted by practicing hydrologists over the last 20 years for short- and long-term streamflow forecasting.


The forecast accuracy of the ANN models has been further improved through substantial research effort in pre/postprocessing of the input/output variables. When the streamflow data are highly nonstationary, AI techniques may not be able to simulate and forecast streamflows without pre/postprocessing of the input/output datasets, and they suffer from slow convergence, poor generalization, entrapment in local minima, and overfitting problems (Luo et al., 2019). Recently, in such situations, a relatively new learning approach, namely the hybrid technique, has been proposed, which combines data preprocessing and AI techniques. The hybrid technique is being increasingly applied as an important tool to improve the forecast accuracy of streamflow. In the future, it will be challenging to convert the broad and uncertain predictions of climate change into corresponding predictions of streamflow patterns.

1.8 Concluding remarks

Streamflow is an important component of the hydrologic cycle, and its measurement or estimation is not easy because of the complex, nonlinear, dynamic, and chaotic behavior of the process. Forecasting of streamflow is also one of the challenging tasks often confronted by hydrologists. Streamflow forecasting involves the use of historical data in the development of either physically based white-box models or data-driven black-box models. Development of physically based models requires a huge quantity and variety of data on the various physical processes of the hydrologic system, which are mostly difficult to obtain under real-field situations, especially in developing countries. Data-driven models, by contrast, do not require knowledge about the system, and hence they have become quite popular, mainly after advancements in soft computing technology. Traditionally, stochastic time series models have been used to forecast streamflows. In the 1990s, ANN models gained momentum in streamflow forecasting owing to their better accuracy than that of the time series models. In addition, other AI techniques, including SVM, fuzzy logic, GP, and GEP, have been employed for streamflow forecasting. Later on, the performance of the existing AI-based data-driven models was enhanced by integrating them with other data-driven techniques called preprocessing techniques, which include various decomposition methods, i.e., wavelet transformation, singular spectrum analysis, empirical mode decomposition, and variational mode decomposition, among others. This integration of two or more data-driven techniques has yielded several hybrid models that further boosted the accuracy of streamflow forecasts. Hybridization of data-driven techniques has become a customary approach in studies reported in the last 5 years dealing with streamflow forecasting and prediction. In recent times, researchers have been attempting to adopt more and more new AI-based data-driven techniques for preprocessing of streamflow data prior to using other AI techniques for forecasting.


There is also a tendency in recent studies to comparatively evaluate the performance of multiple data-driven techniques. It is apparent from the literature that the findings of different streamflow forecasting studies vary widely, as there is no single data-driven technique that performs best under all situations. Unavailability of good-quality streamflow data of adequate length poses a major limitation to modeling and forecasting studies. Selection of the optimum time scale is also a crucial factor in making reasonable forecasts. Hence, efforts should be made to develop gauging stations for streamflow monitoring at the basin scale in different parts of the world. Uncertainty in streamflow forecasts attributable to the employed data-driven model has been considerably reduced with the introduction of hybrid models. Finally, it is concluded that there is vast scope for improving streamflow forecasting methods using modern tools and techniques along with location-specific, in-depth knowledge of the hydrologic system. Moreover, there is a need to combine physically based models with data-driven models in making streamflow forecasts so as to take advantage of both black-box and white-box models.

References

Abbasi, M., Farokhnia, A., Bahreinimotlagh, M., Roozbahani, R., 2020. A hybrid of random forest and deep auto-encoder with support vector regression methods for accuracy improvement and uncertainty reduction of long-term streamflow prediction. J. Hydrol. 125717. https://doi.org/10.1016/j.jhydrol.2020.125717.

Abbaszadeh, P., Alipour, A., Asadi, S., 2017. Development of a coupled wavelet transform and evolutionary Levenberg-Marquardt neural networks for hydrological process modeling. Comput. Intell. 34 (1), 175–199. https://doi.org/10.1111/coin.12124.

Abdourahamane, Z.S., Acar, R., Serkan, S., 2019. Wavelet-copula-based mutual information for rainfall forecasting applications. Hydrol. Process. 33, 1127–1142. https://doi.org/10.1002/hyp.13391.

Abrahart, R.J., See, L., 2000. Comparing neural network and autoregressive moving average techniques for the provision of continuous river flow forecasts in two contrasting catchments. Hydrol. Process. 14 (11–12), 2157–2172. https://doi.org/10.1002/1099-1085(20000815/30)14:11/123.0.CO;2-S.

Abudu, S., King, J.P., Pagano, T.C., 2010. Application of partial least-squares regression in seasonal streamflow forecasting. J. Hydrol. Eng. 15 (8), 612–623. https://doi.org/10.1061/(ASCE)HE.1943-5584.0000216.

Adamowski, J., Sun, K., 2010. Development of a coupled wavelet transform and neural network method for flow forecasting of non-perennial rivers in semi-arid watersheds. J. Hydrol. 390 (1–2), 85–91. https://doi.org/10.1016/j.jhydrol.2010.06.033.

Adamowski, J.F., 2008. River flow forecasting using wavelet and cross-wavelet transform models. Hydrol. Process. 22, 4877–4891. https://doi.org/10.1002/hyp.7107.

Adnan, R.M., Liang, Z., Heddam, S., Zounemat-Kermani, M., Kişi, O., Li, B., 2020. Least square support vector machine and multivariate adaptive regression splines for streamflow prediction in mountainous basin using hydrometeorological data as inputs. J. Hydrol. 586, 124371. https://doi.org/10.1016/j.jhydrol.2019.124371.

Streamflow forecasting Chapter | 1

33

Akintu g, B., Rasmussen, P.F., 2005. A Markov switching model for annual hydrologic time series. Water Resour. Res. 41 (9) https://doi.org/10.1029/2004WR003605. Aksoy, H., 2003. Markov chain-based modeling techniques for stochastic generation of daily intermittent streamflows. Adv. Water Resour. 26 (6), 663e671. https://doi.org/10.1016/ S0309-1708(03)00031-9. Al-Juboori, A.M., 2021. A hybrid model to predict monthly streamflow using neighboring rivers annual flows. Water Resour. Manag. 1e15. https://doi.org/10.1007/s11269-020-02757-4. Al-Sudani, Z.A., Salih, S.Q., Sharafati, A., Yaseen, Z.M., 2019. Development of multivariate adaptive regression spline integrated with differential evolution model for streamflow simulation. J. Hydrol. 573, 1e12. https://doi.org/10.1016/J.JHYDROL.2019.03.004. Alley, W.M., 1985. Water balance models in one-month-ahead streamflow forecasting. Water Resour. Res. 21 (4), 597e606. https://doi.org/10.1029/WR021i004p00597. Amendola, A., Niglio, M., Vitale, C., 2006. Multi-step SETARMA predictors in the analysis of hydrological time series. Phys. Chem. Earth, Parts A/B/C 31 (18), 1118e1126. https://doi.org/ 10.1016/j.pce.2006.04.040. Amiri, E., 2015. Forecasting daily river flows using nonlinear time series models. J. Hydrol. 527, 1054e1072. https://doi.org/10.1016/j.jhydrol.2015.05.048. Amorocho, J., Brandstetter, A., 1971. Determination of nonlinear functional response functions in rainfall-runoff processes. Water Resour. Res. 7 (5), 1087e1101. https://doi.org/10.1029/ WR007i005p01087. Anderson, T.W., 1971. The Statistical Analysis of Time Series. Wiley, New York. Anderson, P.L., Meerschaert, M.M., 1998. Modeling river flows with heavy tails. Water Resour. Res. 34 (9), 2271e2280. https://doi.org/10.1029/98WR01449. Aqil, M., Kita, I., Yano, A., Nishiyama, S., 2007. A comparative study of artificial neural networks and neuro-fuzzy in continuous modeling of the daily and hourly behaviour of runoff. J. Hydrol. 337, 22e34. https://doi.org/10.1016/J.JHYDROL.2007.01.013. Araghinejad, S., Fayaz, N., Hosseini-Moghari, S.M., 2018. Development of a hybrid data driven model for hydrological estimation. Water Resour. Manag. 32 (11), 3737e3750. https://doi.org/ 10.1007/s11269-018-2016-3. ASCE Task Committee, 2000. Artificial neural networks in hydrology. I: preliminary concepts by the ASCE task committee on application of artificial neural networks in hydrology. J. Hydrol. Eng. ASCE 5 (2), 115e123. https://doi.org/10.1061/(ASCE)1084-0699(2000)5:2(115). ASCE Task Committee, 2000. Artificial neural networks in hydrology. II: hydrologic applications by the ASCE task committee on application of artificial neural networks in hydrology. J. Hydrol. Eng. ASCE 5 (2), 124e137. https://doi.org/10.1061/(ASCE)1084-0699(2000)5:2(124). Astatkie, T., 2006. Absolute and relative measures for evaluating the forecasting performance of time series models for daily streamflows. Nord. Hydrol. 37 (3), 205e215. Awwad, H.M., Valde´s, J.B., Restrepo, P.J., 1994. Streamflow forecasting for Han River basin, Korea. J. Water Resour. Plann. Manag. ASCE 120 (5), 651e673. https://doi.org/10.1061/ (ASCE)0733-9496(1994)120:5(651). Azad, A., Farzin, S., Kashi, H., Sanikhani, H., Karami, H., Kis¸i, O., 2018. Prediction of river flow using hybrid neuro-fuzzy models. Arab. J. Geosci. 11 (22), 1e14. https://doi.org/10.1007/ s12517-018-4079-0. Azmi, M., Sarmadi, F., 2016. Improving the accuracy of K-nearest neighbour method in long-lead hydrological forecasting. ScientiaIranica. Trans. A, Civil Eng. 23, 856e863. 
Babovic, V., Keijzer, M., 2002. Rainfall runoff modelling based on genetic programming. Nord. Hydrol. 33 (5), 331e346. https://doi.org/10.2166/NH.2002.0012.

34 Advances in Streamflow Forecasting Badrzadeh, H., Sarukkalige, R., Jayawardena, A.W., 2013. Impact of multi-resolution analysis of artificial intelligence models inputs on multi-step ahead river flow forecasting. J. Hydrol. 507, 75e85. https://doi.org/10.1016/j.jhydrol.2013.10.017. Badrzadeh, H., Sarukkalige, R., Jayawardena, A.W., 2016. Improving ANN-based short-term and long-term seasonal river flow forecasting with signal processing techniques. River Res. Appl. 32, 245e256. https://doi.org/10.1002/rra.2865. Bai, Y., Bezak, N., Sapac, K., Klun, M., Zhang, J., 2019. Short-term streamflow forecasting using the feature-enhanced regression model. Water Resour. Manag. 33 (14), 4783e4797. https:// doi.org/10.1007/s11269-019-02399-1. Bai, Y., Chen, Z., Xie, J., Li, C., 2016. Daily reservoir inflow forecasting using multiscale deep feature learning with hybrid models. J. Hydrol. 532, 193e206. https://doi.org/10.1016/ j.jhydrol.2015.11.011. Banihabib, M.E., Ahmadian, A., Valipour, M., 2018. Hybrid MARMA-NARX model for flow forecasting based on the large-scale climate signals, sea-surface temperatures, and rainfall. Nord. Hydrol. 49 (6), 1788e1803. https://doi.org/10.2166/NH.2018.145. Bartolini, P., Salas, J.D., 1993. Modeling of streamflow processes at different time scales. Water Resour. Res. 29 (8), 2573e2587. https://doi.org/10.1029/93WR00747. Besaw, L.E., Rizzo, D.M., Bierman, P.R., Hackett, W.R., 2010. Advances in ungauged streamflow prediction using artificial neural networks. J. Hydrol. 386, 27e37. https://doi.org/10.1016/ j.jhydrol.2010.02.037. Bonne´, J., 1971. Stochastic simulation of monthly streamflow by a multiple regression model utilizing precipitation data. J. Hydrol. 12 (4), 285e310. https://doi.org/10.1016/00221694(71)90027-8. Bourdin, D.R., Fleming, S.W., Stull, R.B., 2012. Streamflow modelling: a primer on applications approaches and challenges. Atmos.-Ocean 50 (4), 507e536. https://doi.org/10.1080/ 07055900.2012.734276. Bowden, G.J., Maier, H.R., Dandy, G.C., 2005. Input determination for neural network models in water resources applications. Part 2. Case study: forecasting salinity in a river. J. Hydrol. 301, 93e107. https://doi.org/10.1016/j.jhydrol.2004.06.020. Box, G.E.P., Jenkins, G.M., 1970. Time Series Analysis: Forecasting and Control. San Francisco Holden Day (Revised Ed. 1976). Brown, R.G., 1959. Statistical Forecasting for Inventory Control. New York McGraw-Hill, p. 402. Brown, R.G., 1963. Smoothing, Forecasting and Prediction of Discrete Time Series. NJ PrenticeHall, Englewood Cliffs, p. 468. https://doi.org/10.2307/2344012. Burlando, P., Montanari, A., Rosso, R., 1996. Modelling hydrological data with and without long memory. Meccanica 31 (1), 87e101. Burn, D.H., McBean, E.A., 1985. River flow forecasting model for Sturgeon River. J. Hydraul. Eng. ASCE 111 (2), 316e333. https://doi.org/10.1061/(ASCE)0733-9429(1985)111:2(316). Cannas, B., Fanni, A., See, L., Sias, G., 2006. Data preprocessing for river flow forecasting using neural networks: wavelet transforms and data partitioning. Phys. Chem. Earth, Parts A/B/C 31 (18), 1164e1171. https://doi.org/10.1016/J.PCE.2006.03.020. Carlson, R.F., MacCormick, A.J.A., Watts, D.G., 1970. Application of linear models to four annual streamflow series. Water Resour. Res. 6 (4), 1070e1078. https://doi.org/10.1029/ WR006i004p01070. Castellano-Me´ndez, M., Gonza´lez-Manteiga, W., Febrero-Bande, M., Prada-Sa´nchez, J.M., Lozano-Caldero´n, R., 2004. 
Modelling of the monthly and daily behaviour of the runoff of the Xallasriver using BoxeJenkins and neural networks methods. J. Hydrol. 296 (1e4), 38e58. https://doi.org/10.1016/j.jhydrol.2004.03.011.

Streamflow forecasting Chapter | 1

35

Chang, F.-J., Chen, Y.-C., 2001. A counterpropagation fuzzy-neural network modeling approach to real time streamflow prediction. J. Hydrol. 245, 153e164. https://doi.org/10.1016/S00221694(01)00350-X. Chatfield, C., Yar, M., 1988. Holt-Winters forecasting: some practical issues. J. Roy. Stat. Soc.: Series D (The Statistician) 37 (2), 129e140. https://doi.org/10.2307/2348687. Chau, K.W., 2006. Particle swarm optimization training algorithm for ANNs in stage prediction of Shing Mun River. J. Hydrol. 329 (3e4), 363e367. https://doi.org/10397/1194. Chau, K.W., Jiang, Y.W., 2002. Three-dimensional pollutant transport model for the pearl river estuary. Water Res. 36 (8), 2029e2039. https://doi.org/10.1016/s0043-1354(01)00400-6. Chau, K.W., Jin, H., 1998. Eutrophication model for a coastal bay in Hong Kong. J. Environ. Eng. 124 (7), 628e638. https://doi.org/10.1061/(ASCE)0733-9372(1998)124:7(628). Chau, K.W., Wu, C.L., Li, Y.S., 2005. Comparison of several flood forecasting models in Yangtze river. J. Hydrol. Eng. ASCE 10, 485e491. https://doi.org/10.1061/(ASCE)1084-0699(2005) 10:6(485). Chen, X.Y., Chau, K.W., Wang, W.C., 2015. A novel hybrid neural network based on continuity equation and fuzzy pattern-recognition for downstream daily river discharge forecasting. J. Hydroinf. 17 (5), 733e744. https://doi.org/10.2166/HYDRO.2015.095. Chen, C.S., Liu, C.H., Su, H.C., 2008. A nonlinear time series analysis using two-stage genetic algorithms for streamflow forecasting. Hydrol. Process. 22 (18), 3697e3711. https://doi.org/ 10.1002/hyp.6973. Chetan, M., Sudheer, K.P., 2006. A hybrid linear-neural model for river flow forecasting. Water Resour. Res. 42, W04402. https://doi.org/10.1029/2005WR004072. Chiang, Y., Chang, L., Chang, F., 2004. Comparison of static-feedforward and dynamic feedback neural networks for rainfall-runoff modeling. J. Hydrol. 290 (3e4), 297e311. https://doi.org/ 10.1016/j.jhydrol.2003.12.033. Chiu, C.L., 1972. Stochastic methods in hydraulics and hydrology of streamflow. Geophys. Surv. 1 (1), 61e84. https://doi.org/10.1007/BF01449551. Chu, H., Wei, J., Wu, W., 2020. Streamflow prediction using LASSO-FCM-DBN approach based on hydrometeorological condition classification. J. Hydrol. 580, 124253. https://doi.org/ 10.1016/j.jhydrol.2019.124253. Cigizoglu, H.K., 2005. Application of generalized regression neural networks to intermittent flow forecasting and estimation. J. Hydrol. Eng. ASCE 10 (4), 336e341. https://doi.org/10.1061/ (ASCE)1084-0699(2005)10:4(336). Clair, T.A., Ehrman, J.M., 1998. Using neural networks to assess the influence of changing seasonal climates in modifying discharge, dissolved organic carbon, and nitrogen export in eastern Canadian rivers. Water Resour. Res. 34 (3), 447e455. https://doi.org/10.1029/ 97WR03472. Coulibaly, P., Anctil, F., Bobee, B., 2000. Daily reservoir inflow forecasting using artificial neural networks with stopped training approach. J. Hydrol. 230, 244e257. https://doi.org/10.1016/ S0022-1694(00)00214-6. Coulibaly, P., Hache, M., Fortin, V., Bobee, B., 2005. Improving daily reservoir inflow forecasts with model combination. J. Hydrol. Eng. ASCE 10 (2), 91e99. https://doi.org/10.1061/ (ASCE)1084-0699(2005)10:2(91). Cryer, J.D., 1986. Time Series Analysis. Duxbury Press, Massachusetts. Cui, H., Singh, V.P., 2016. Maximum entropy spectral analysis for streamflow forecasting. Phys. Stat. Mech. Appl. 442, 91e99. https://doi.org/10.1016/j.physa.2015.08.060. Danandeh Mehr, A., 2018. 
An improved gene expression programming model for streamflow forecasting in intermittent streams. J. Hydrol. 563, 669e678. https://doi.org/10.1016/ j.jhydrol.2018.06.049.

36 Advances in Streamflow Forecasting Danandeh Mehr, A., Kahya, E., 2017. A Pareto-optimal moving average multigene genetic programming model for daily streamflow prediction. J. Hydrol. 549, 603e615. https://doi.org/ 10.1016/j.jhydrol.2017.04.045. Danandeh Mehr, A., Kahya, E., Bagheri, F., Deliktas, E., 2013. Successive-station monthly streamflow prediction using neuro-wavelet technique. Earth Sci. India 7, 217e229. https:// doi.org/10.1007/s12145-013-0141-3. Danandeh Mehr, A., Kahya, E., Bagheri, F., Deliktas, E., 2014. Successive-station monthly streamflow prediction using neuro-wavelet technique. Earth Sci. India 7 (4), 217e229. Danandeh Mehr, A., Kahya, E., Sahin, A., Nazemosadat, M.J., 2015. Successive-station monthly streamflow prediction using different artificial neural network algorithms. Int. J. Environ. Sci. Technol. 12 (7), 2191e2200. https://doi.org/10.1007/s13762-014-0613-0. Danandeh Mehr, A., Kahya, E., Yerdelen, C., 2014. Linear genetic programming application for successive-station monthly streamflow prediction. Comput. Geosci. 70, 63e72. https://doi.org/ 10.1016/j.cageo.2014.04.015. Danandeh Mehr, A., Nourani, V., 2018. Season algorithm-multigene genetic programming: a new approach for rainfall-runoff modelling. Water Resour. Manag. 32 (8), 2665e2679. https:// doi.org/10.1007/s11269-018-1951-3. Dariane, A.B., Azimi, S., 2018. Streamflow forecasting by combining neural networks and fuzzy models using advanced methods of input variable selection. J. Hydroinf. 20 (2), 520e532. https://doi.org/10.2166/HYDRO.2017.076. Dariane, A.B., Farhani, M., Azimi, S., 2018. Long term streamflow forecasting using a hybrid entropy model. Water Resour. Manag. 32 (4), 1439e1451. https://doi.org/10.1007/s11269017-1878-0. Davie, T., 2008. Fundamentals of Hydrology, second ed. Routledge, Taylor & Francis Group, London, p. 200. https://doi.org/10.4324/9780203933664. Dawson, C.W., Wilby, R.L., 2001. Hydrological modelling using artificial neural networks. Prog. Phys. Geogr. 25 (1), 80e108. https://doi.org/10.1177/030913330102500104. Dehghani, R., Poudeh, H.T., Younesi, H., Shahinejad, B., 2020. Daily streamflow prediction using support vector machine-artificial flora (SVM-AF) hybrid model. Acta Geophys. 68 (6), 1763e1778. https://doi.org/10.1007/s11600-020-00472-7. Deo, R.C., Sahin, M., 2016. An extreme learning machine model for the simulation of monthly mean streamflow water level in eastern Queensland. Environ. Monit. Assess. 188 (2) https:// doi.org/10.1007/s10661-016-5094-9. Devia, G.K., Ganasri, B.P., Dwarakish, G.S., 2015. A review on hydrological models. Aquatic Proc. 4, 1001e1007. https://doi.org/10.1016/j.aqpro.2015.02.126. Dietrich, W.E., 1987. Mechanics of flow and sediment transport in river bends. River Channel. 87 (3), 179e227. https://doi.org/10.1029/WM012. Dobriyal, P., Badola, R., Tuboi, C., Hussain, S.A., 2017. A review of methods for monitoring streamflow for sustainable water resource management. Appl. Water Sci. 7, 2617e2628. https://doi.org/10.1007/s13201-016-0488-y. Dorado, J., Rabun˜AL, J.R., Pazos, A., Rivero, D., Santos, A., Puertas, J., 2003. Prediction and modeling of the rainfall-runoff transformation of a typical urban basin using ANN and GP. Appl. Artif. Intell. 17 (4), 329e343. https://doi.org/10.1080/713827142. Dorigo, M., Maniezzo, V., Colorni, A., 1996. The ant systems: optimization by a colony of cooperative agents. IEEE Trans. Man, Mach. Cybern. B 26. https://doi.org/10.1109/ 3477.484436. El-Shafie, A., Taha, M.R., Noureldin, A., 2007. 
A neuro-fuzzy model for inflow forecasting of the Nile river at Aswan high dam. Water Resour. Manag. 21, 533e556. https://doi.org/10.1007/ S11269-006-9027-1.

Streamflow forecasting Chapter | 1

37

Elganiny, M.A., Eldwer, A.E., 2018. Enhancing the forecasting of monthly streamflow in the main key stations of the river nile basin. Water Resour. 45 (5), 660e671. https://doi.org/10.1134/ S0097807818050135. Elshorbagy, A., Simonovic, S.P., Panu, U.S., 2002. Estimation of missing stream flow data using principles of chaos theory. J. Hydrol. 255, 123e133. https://doi.org/10.1016/S0022-1694(01) 00513-3. Fahimi, F., Yaseen, Z.M., El-shafie, A., 2017. Application of soft computing based hybrid models in hydrological variables modeling: a comprehensive review. Theor. Appl. Climatol. 128 (3e4), 875e903. https://doi.org/10.1007/s00704-016-1735-8. Fahlman, S.E., Lebiere, C., 1990. The Cascaded-Correlation Learning architecture. Report. CMUCS-90-100. Carnegie Mellon University, Pittsburgh, Pa School of Computer Science. Farmer, W.H., 2016. Ordinary kriging as a tool to estimate historical daily streamflow records. Hydrol. Earth Syst. Sci. 20, 2721e2735. https://doi.org/10.5194/hess-20-2721-2016. Fathian, F., Fard, A.F., Ouarda, T.B., Dinpashoh, Y., Nadoushani, S.M., 2019. Modeling streamflow time series using nonlinear SETAR-GARCH models. J. Hydrol. 573, 82e97. https:// doi.org/10.1007/s00704-016-1735-8. Fathian, F., Mehdizadeh, S., Sales, A.K., Safari, M.J.S., 2019. Hybrid models to improve the monthly river flow prediction: integrating artificial intelligence and non-linear time series models. J. Hydrol. 575, 1200e1213. https://doi.org/10.1016/J.JHYDROL.2019.06.025. Feng, Z.K., Niu, W.-J., Tang, Z.-Y., Jiang, Z.-Q., Xu, Y., Liu, Y., Zhang, H.R., 2020. Monthly runoff time series prediction by variational mode decomposition and support vector machine based on quantum-behaved particle swarm optimization. J. Hydrol. 583 (2020), 124627. https://doi.org/10.1016/j.jhydrol.2020.124627. Fernandez, B., Salas, J.D., 1990. Gamma-autoregressive models for stream-flow simulation. J. Hydraul. Eng. ASCE 116 (11), 1403e1414. https://doi.org/10.1061/(ASCE)0733-9429(1990) 116:11(1403). Firat, M., Gu¨ngo¨r, M., 2008. Hydrological time-series modelling using an adaptive neuro-fuzzy inference system. Hydrol. Process. 22 (13), 2122e2132. https://doi.org/10.1002/hyp.6812. Gao, P.P., Li, Y.P., Huang, G.H., Su, Y.Y., 2021. An integrated Bayesian least-squares-supportvector-machine factorial-analysis (B-LSVM-FA) method for inferring inflow from the Amu Darya to the Aral Sea under ensemble prediction. J. Hydrol. 594 (2021), 125909. https:// doi.org/10.1016/j.jhydrol.2020.125909. Gardner, E.S., 1985. Exponential smoothing: the state of the art. J. Forecast. 4 (1), 1e28. https:// doi.org/10.1002/for.3980040103. Garen, D.C., 1992. Improved techniques in regression-based streamflow volume forecasting. J. Water Resour. Plann. Manag. ASCE 118 (6), 654e670. https://doi.org/10.1061/(ASCE) 0733-9496(1992)118:6(654). Ghaith, M., Siam, A., Li, Z., El-Dakhakhni, W., 2020. Hybrid hydrological data-driven approach for daily streamflow forecasting. J. Hydrol. Eng. ASCE 25 (2), 04019063. https://doi.org/ 10.1061/(ASCE)HE.1943-5584.0001866. Ghorbani, M.A., Khatibi, R., Danandeh Mehr, A., Asadi, H., 2018. Chaos-based multigene genetic programming: a new hybrid strategy for river flow forecasting. J. Hydrol. 562, 455e467. https://doi.org/10.1016/j.jhydrol.2018.04.054. Grossmann, A., Morlet, J., 1984. Decomposition of hardy functions into square integrable wavelets of constant shape. SIAM J. Mathemat. Anal. 15, 723e736. https://doi.org/10.1137/0515056. Guimara˜es Santos, C.A., da Silva, G.B.L., 2014. 
Daily streamflow forecasting using a wavelet transform and artificial neural network hybrid models. Hydrol. Sci. J. 59 (2), 312e324. https:// doi.org/10.1080/02626667.2013.800944.

38 Advances in Streamflow Forecasting Guo, Y., Xu, Y.P., Yu, X., Chen, H., Gu, H., Xie, J., 2020. AI-based techniques for multi-step streamflow forecasts: application for multi-objective reservoir operation optimization and performance assessment. Hydrol. Earth Syst. Sci. Discuss. 1e52. https://doi.org/10.5194/hess2020-617. Guven, A., 2009. Linear genetic programming for time-series modelling of daily flow rate. J. Earth System. Sci. 118 (2), 137e146. https://doi.org/10.1007/s12040-009-0022-9. Hadi, S.J., Tombul, M., 2018. Monthly streamflow forecasting using continuous wavelet and multigene genetic programming combination. J. Hydrol. 561, 674e687. https://doi.org/10.1016/ J.JHYDROL.2018.04.036. Haltiner, J.P., Salas, J.D., 1988. Short-term forecasting of snowmelt discharge using ARMAX models. Water Resour. Bull. 24 (5), 1083e1089. https://doi.org/10.1111/j.17521688.1988.tb03025.x. Hannan, E.J., 1970. Multiple Time Series. Wiley, New York. Hao, Z., Singh, V.P., 2012. Entropy-copula method for single-site monthly streamflow simulation. Water Resour. Res. 48 (6) https://doi.org/10.1029/2011WR011419. Harms, A.A., Campbell, T.H., 1967. An extension to the Thomas-Fiering model for the sequential generation of streamflow. Water Resour. Res. 3 (3), 653e661. https://doi.org/10.1029/ WR003i003p00653. Haykin, S., 1994. Neural Networks: A Comprehensive Foundation. Macmillan College Publishing Company Inc. He, X., Luo, J., Zuo, G., Xie, J., 2019. Daily runoff forecasting using a hybrid model based on variational mode decomposition and deep neural networks. Water Resour. Manage. 33 (4), 1571e1590. https://doi.org/10.1007/s11269-019-2183-x. He, Z., Wen, X., Liu, H., Du, J., 2014. A comparative study of artificial neural network, adaptive neuro fuzzy inference system and support vector machine for forecasting river flow in the semiarid mountain region. J. Hydrol. 509, 379e386. https://doi.org/10.1016/ j.jhydrol.2013.11.054. Hejazi, M.I., Cai, X.M., 2009. Input variable selection for water resources systems using a modified minimum redundancy maximum relevance (mMRMR) algorithm. Adv. Water Resour. 32 (4), 582e593. https://doi.org/10.1016/j.advwatres.2009.01.009. Hipel, K.W., McLeod, A.I., McBean, E.A., 1977. Stochastic modelling of the effects of reservoir operation. J. Hydrol. 32 (1e2), 97e113. https://doi.org/10.1016/0022-1694(77)90121-4. Holland, J., 1975. Adaptation in Natural and Artificial Systems. Univ Michigan Press, Ann Arbor. Holt, C.C., 1957. Forecasting seasonals and trends by exponentially weighted averages. O.N.R. Memorandum 52/1957, Carnegie Institute of Technology. Reprinted with discussion in 2004. Int. J. Forecast. 20, 5e13. Honorato, A.G.S.M., da Silva, G.B.L., Santos, C.A.G., 2018. Monthly streamflow forecasting using neuro-wavelet techniques and input analysis. Hydrol. Sci. J. 63 (15e16), 2060e2075. https://doi.org/10.1080/02626667.2018.1552788. Hopfield, J.J., 1982. Neural networks and physical systems with emergent collective computational abilities. Proc., Nat. Academy Sci. 79, 2554e2558. https://doi.org/10.1073/pnas.79.8.2554. Hsu, K.L., Gupta, H.V., Sorooshian, S., 1995. Artificial neural network modeling of the rainfallrunoff process. Water Resour. Res. 31 (10), 2517e2530. https://doi.org/10.1029/95WR01955. Huang, S.Z., Chang, J.X., Huang, Q., Chen, Y.T., 2014. Monthly streamflow prediction using modified EMD-based support vector machine. J. Hydrol. 511, 764e775. https://doi.org/ 10.1016/J.JHYDROL.2014.01.062. Huang, W.R., Xu, B., Hilton, A., 2004. 
Forecasting flows in Apalachicola river using neural networks. Hydrol. Process. 18, 2545e2564. https://doi.org/10.1002/hyp.1492.

Streamflow forecasting Chapter | 1

39

Huo, Z., Feng, S., Kang, S., Huang, G., Wang, F., Guo, P., 2012. Integrated neural networks for monthly river flow estimation in arid inland basin of Northwest China. J. Hydrol. 420e421, 159e170. https://doi.org/10.1016/j.jhydrol.2011.11.054. Jacoby, S.L.S., 1966. A mathematical model for nonlinear hydrologic systems. J. Geophys. Res. 71 (20), 4811e4824. https://doi.org/10.1029/JZ071i020p04811. Jacquin, A.P., Shamseldin, A.Y., 2009. Review of the application of fuzzy inference systems in river flow forecasting. J. Hydroinf. 11 (3e4), 202e210. Jain, S.K., Das, A., Srivastava, D.K., 1999. Application of ANN for reservoir inflow prediction and operation. J. Water Resour. Plann. Manage. ASCE 125 (5), 263e271. https://doi.org/10.1061/ (ASCE)0733-9496(1999)125:5(263). Jain, A., Kumar, A.M., 2007. Hybrid neural network models for hydrologic time series forecasting. Appl. Soft Comput. 7 (2), 585e592. https://doi.org/10.1016/j.asoc.2006.03.002. Jang, J.-S.R., 1993. ANFIS: adaptive-network-based fuzzy inference systems. IEEE Trans. Syst., Man Cybernet. 23 (3), 665e685. https://doi.org/10.1109/21.256541. Jayawardena, A.W., Gurung, A.B., 2000. Noise reduction and prediction of hydro- meteorological time series: dynamical systems approach vs. stochastic approach. J. Hydrol. 228, 242e264. https://doi.org/10.1016/S0022-1694(00)00142-6. Jayawardena, A.W., Lai, F., 1994. Analysis and prediction of chaos in rainfall and stream flow time series. J. Hydrol. 153, 23e52. https://doi.org/10.1016/0022-1694(94)90185-6. Jimenez, C., McLeod, A.I., Hipel, K.W., 1989. Kalman filter estimation for periodic autoregressive-moving average models. Stoch. Hydrol. Hydraul. 3 (3), 227e240. https:// doi.org/10.1007/BF01543862. Jothiprakash, V., Magar, R.B., 2012. Multi-time-step ahead daily and hourly intermittent reservoir inflow prediction by artificial intelligent techniques using lumped and distributed data. J. Hydrol. 450 (451), 293e307. https://doi.org/10.1016/j.jhydrol.2012.04.045. Kalman, R.E., 1960. A new approach to linear filtering and prediction problems. J. Basic Eng. 82, 34e45. https://doi.org/10.1115/1.3662552. Kalteh, A.M., 2013. Monthly river flow forecasting using artificial neural network and support vector regression models coupled with wavelet transform. Comput. Geosci. 54, 1e8. https:// doi.org/10.1016/j.cageo.2012.11.015. Kalteh, A.M., 2015. Wavelet genetic algorithm-support vector regression (wavelet GA-SVR) for monthly flow forecasting. Water Resour. Manag. 29 (4), 1283e1293. https://doi.org/10.1007/ s11269-014-0873-y. Kalteh, A.M., Berndtsson, R., 2007. Interpolating monthly precipitation by self-organizing map (SOM) and multilayer perceptron (MLP). Hydrol. Sci. J. 52 (2), 305e317. https://doi.org/ 10.1623/hysj.52.2.305. Kambalimath, S., Deka, P.C., 2021. Performance enhancement of SVM model using discrete wavelet transform for daily streamflow forecasting. Environ. Earth Sci. 80 (3), 1e16. https:// doi.org/10.1007/s12665-021-09394-z. Karthikeyan, L., Kumar, D.N., 2013. Predictability of nonstationary time series using wavelet and EMD based ARMA models. J. Hydrol. 502, 103e119. https://doi.org/10.1016/ j.jhydrol.2013.08.030. Karunanithi, N., Grenney, W.J., Whitley, D., Bovee, K., 1994. Neural networks for river flow prediction. J. Comput. Civil Eng. ASCE 8 (2), 201e220. https://doi.org/10.1061/(ASCE)08873801(1994)8:2(201). Kasabov, N.K., 1996. Foundations of Neural Networks, Fuzzy Systems, and Knowledge Engineering. MIT Press, Cambridge, Mass, p. 550.

40 Advances in Streamflow Forecasting Kashani, M.H., Ghorbani, M.A., Dinpashoh, Y., Shahmorad, S., 2016. Integration of Volterra model with artificial neural networks for rainfall-runoff simulation in forested catchment of northern Iran. J. Hydrol. 540, 340e354. https://doi.org/10.1016/j.jhydrol.2016.06.028. Kasiviswanathan, K.S., He, J., Sudheer, K.P., Tay, J.H., 2016. Potential application of wavelet neural network ensemble to forecast streamflow for flood management. J. Hydrol. 536, 161e173. https://doi.org/10.1016/j.jhydrol.2016.02.044. Kennedy, J., Eberhart, R., 1995. Particle swarm optimization. Neural Networks. In: Proceedings., IEEE Int. Conf. 4, vol. 4, pp. 1942e1948. https://doi.org/10.1109/ICNN.1995.488968, 1995. Kentel, E., 2009. Estimation of river flow by artificial neural networks and identification of input vectors susceptible to producing unreliable flow estimates. J. Hydrol. 375, 481e488. https:// doi.org/10.1016/j.jhydrol.2009.06.051. Kim, S.E., Seo, I.W., 2015. Artificial neural network ensemble modeling with exploratory factor analysis for streamflow forecasting. J. Hydroinf. 17 (4), 614e639. https://doi.org/10.2166/ HYDRO.2015.033. ¨ ., 2003. River flow modeling using artificial neural networks. J. Hydrol. Eng. ASCE 9 (1), Kis¸i, O 60e63. https://doi.org/10.1061/(ASCE)1084-0699(2004)9:1(60). ¨ ., 2005. Daily river flow forecasting using artificial neural networks and auto-regressive Kis¸i, O models. Turk. J. Eng. Environ. Sci. 29 (1), 9e20. ¨ ., 2007. Streamflow forecasting using different artificial neural network algorithms. Kis¸i, O J. Hydrol. Eng. ASCE 12 (5), 532e539. https://doi.org/10.1061/(ASCE)1084-0699(2007) 12:5(532). ¨ ., 2008. River flow forecasting and estimation using different artificial neural network Kis¸i, O techniques. Nord. Hydrol. 39 (1), 27e40. https://doi.org/10.2166/nh.2008.026. ¨ ., 2008b. Stream flow forecasting using neuro-wavelet technique. Hydrol. Process. 22, Kis¸i, O 4142e4152. https://doi.org/10.1002/hyp.7014. ¨ ., 2010. Wavelet regression model for short-term streamflow forecasting. J. Hydrol. 389 Kis¸i, O (3e4), 344e353. https://doi.org/10.1016/j.jhydrol.2010.06.013. ¨ ., Nia, A.M., Gosheh, M.G., Tajabadi, M.R.J., Ahmadi, A., 2012. Intermittent streamflow Kis¸i, O forecasting by using several data driven techniques. Water Resour. Manage. 26 (2), 457e474. https://doi.org/10.1007/s11269-011-9926-7. Kohonen, T., 1988. An introduction to neural computing. Neural Netw. 1 (1), 3e16. https://doi.org/ 10.1016/0893-6080(88)90020-2. Komornı´k, J., Komornı´kova´, M., Mesiar, R., Szo¨keova´, D., Szolgay, J., 2006. Comparison of forecasting performance of nonlinear models of hydrological time series. Phys. Chem. Earth, Parts A/B/C 31 (18), 1127e1145. https://doi.org/10.1016/j.pce.2006.05.006. Kothyari, U.C., Singh, V.P., 1999. A multiple-input single-output model for flow forecasting. J. Hydrol. 220, 12e26. https://doi.org/10.1016/S0022-1694(99)00055-4. Koza, J.R., 1992. Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press p 819. Krstanovic, P.F., Singh, V.P., 1991. A univariate model for long-term streamflow forecasting, 1. development. Stoch. Hydrol. Hydraul. 5 (3), 173e188. https://doi.org/10.1007/BF01544057. Krstanovic, P.F., Singh, V.P., 1991. A univariate model for long-term streamflow forecasting, 2. application. Stoch. Hydrol. Hydraul. 5 (3), 189e205. https://doi.org/10.1007/BF01544057. Kumar, D.N., Raju, K.S., Sathish, T., 2004. River flow forecasting using recurrent neural networks. Water Resour. 
Manage. 18 (2), 143e161. https://doi.org/10.1023/ B:WARM.0000024727.94701.12. Lall, U., Sharma, A., 1996. A nearest neighbor bootstrap for resampling hydrologic time series. Water Resour. Res. 32 (3), 679e693. https://doi.org/10.1029/95WR02966.

Streamflow forecasting Chapter | 1

41

Le, X.H., Ho, H.V., Lee, G., Jung, S., 2019. Application of long short-term memory (LSTM) neural network for flood forecasting. Water 11 (7), 1387. https://doi.org/10.3390/w11071387. Ledolter, J., 1978. A general class of stochastic models for hydrologic sequences. J. Hydrol. 36 (3e4), 309e325. https://doi.org/10.1016/0022-1694(78)90151-8. Li, F.F., Wang, Z.Y., Qiu, J., 2018. Long-term streamflow forecasting using artificial neural network based on preprocessing technique. J. Forecast. 38 (3), 192e206. https://doi.org/ 10.1002/for.2564. Li, F.-F., Wang, Z.-Y., Zhao, X., Xie, E., Qiu, J., 2019. Decomposition-ANN methods for longterm discharge prediction based on fisher’s ordered clustering with MESA. Water Resour. Manag. 33, 3095e3110. https://doi.org/10.1007/s11269-019-02295-8. Lima, A.R., Cannon, A.J., Hsieh, W.W., 2016. Forecasting daily streamflow using online sequential extreme learning machines. J. Hydrol. 537, 431e443. https://doi.org/10.1016/ j.jhydrol.2016.03.017. Lin, J.Y., Cheng, C.T., Chau, K.W., 2006. Using support vector machines for long-term discharge prediction. Hydrol. Sci. J. 51 (4), 599e612. https://doi.org/10.1623/hysj.51.4.599. Liu, Y., Brown, J., Demargne, J., Seo, D.J., 2011. A wavelet-based approach to assessing timing errors in hydrologic predictions. J. Hydrol. 397 (3e4), 210e224. https://doi.org/10.1016/ j.jhydrol.2010.11.040. Liu, Y., Sang, Y.F., Li, X., Hu, J., Liang, K., 2017. Long-term streamflow forecasting based on relevance vector machine model. Water 9 (1). https://doi.org/10.3390/w9010009. Liu, Y., Ye, L., Qin, H., Hong, X., Ye, J., Yin, X., 2018. Monthly streamflow forecasting based on hidden Markov model and Gaussian mixture regression. J. Hydrol. 561, 146e159. https:// doi.org/10.1016/j.jhydrol.2018.03.057. Liu, Z., Zhou, P., Chen, G., Guo, L., 2014. Evaluating a coupled discrete wavelet transform and support vector regression for daily and monthly streamflow forecasting. J. Hydrol. 519, 2822e2831. https://doi.org/10.1016/j.jhydrol.2014.06.050. Luo, X., Yuan, X., Zhu, S., Xu, Z., Meng, L., Peng, J., 2019. A hybrid support vector regression framework for streamflow forecast. J. Hydrol. 568, 184e193. https://doi.org/10.1016/ j.jhydrol.2018.10.064. Maheswaran, R., Khosa, R., 2012. Wavelet-Volterra coupled model for monthly stream flow forecasting. J. Hydrol. 450e451, 320e335. https://doi.org/10.1016/j.jhydrol.2012.04.017. Maier, H.R., Kapelan, Z., Kasprzyk, J., Kollat, J., Matott, L.S., Cunha, M.C., Dandy, G.C., Gibbs, M.S., Keedwell, E., Marchi, A., Ostfeld, A., Savic, D., Solomatine, D.P., Vrugt, J.A., Zecchin, A.C., Minsker, B.S., Barbour, E.J., Kuczera, G., Pasha, F., Castelletti, A., Giuliani, M., Reed, P.M., 2014. Evolutionary algorithms and other metaheuristics in water resources: current status, research challenges and future directions. Environ. Model. Software 62, 271e299. https://doi.org/10.1016/j.envsoft.2014.09.013. Malik, A., Tikhamarine, Y., Souag-Gamane, D., Kis¸i, O., Pham, Q.B., 2020. Support vector regression optimized by meta-heuristic algorithms for daily streamflow prediction. Stoch. Environ. Res. Risk Assess. 34 (11), 1755e1773. https://doi.org/10.1007/s00477-020-01874-1. Maria, C.M., Wenceslao, G.M., Manuel, F.B., Jose´, M.P.S., Roma´n, L.C., 2004. Modelling of the monthly and daily behaviour of the discharge of the Xallasriver using BoxeJenkins and neural networks methods. J. Hydrol. 296, 38e58. https://doi.org/10.1016/J.JHYDROL.2004.03.011. Maslova, I., Ticlavilca, A.M., McKee, M., 2016. 
Adjusting wavelet-based multiresolution analysis boundary conditions for long-term streamflow forecasting. Hydrol. Process. 30, 57e74. https://doi.org/10.1002/hyp.10564. Matos, J.P., Portela, M.M., Schleiss, A.J., 2018. Towards safer data-driven forecasting of extreme streamflows. Water Resour. Manage. 32 (2), 701e720. https://doi.org/10.1007/ s11269-017-1834-z.

42 Advances in Streamflow Forecasting McCullagh, P., Nelder, J.A., 1989. Generalized Linear Models, 2nd. ed. CRC press, p. 532. McCulloch, W.S., Pitts, W., 1943. A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biophys. 5 (4), 115e133. https://doi.org/10.1007/BF02478259. McKerchar, A.I., Delleur, J.W., 1974. Application of seasonal parametric linear stochastic models to monthly flow data. Water Resour. Res. 10 (2), 246e255. https://doi.org/10.1029/ WR010i002p00246. McLeod, A.I., Hipel, K.W., Lennox, W.C., 1977. Advances in box-jenkins modeling: 2. applications. Water Resour. Res. 13 (3), 577e586. https://doi.org/10.1029/WR013i003p00577. Mehdizadeh, S., Fathian, F., Adamowski, J.F., 2019. Hybrid artificial intelligence-time series models for monthly streamflow modeling. Appl. Soft Comput. 80, 873e887. https://doi.org/ 10.1016/j.asoc.2019.03.046. Mehdizadeh, S., Sales, A.K., 2018. A comparative study of autoregressive, autoregressive moving average, gene expression programming and bayesian networks for estimating monthly streamflow. Water Resour. Manage. 32 (9), 3001e3022. https://doi.org/10.1007/s11269-018-1970-0. Mehran, A., Mazdiyasni, O., AghaKouchak, A., 2015. A hybrid framework for assessing socioeconomic drought: linking climate variability, local resilience, and demand. J. Geophys. Res.: Atmosphere 120 (15), 7520e7533. https://doi.org/10.1002/2015JD023147. Meshgi, A., Schmitter, P., Chui, T.F.M., Babovic, V., 2015. Development of a modular streamflow model to quantify runoff contributions from different land uses in tropical urban environments using genetic programming. J. Hydrol. 525, 711e723. https://doi.org/ 10.1016/j.jhydrol.2015.04.032. Minns, A.W., Hall, M.J., 1996. Artificial neural networks as rainfall-runoff models. Hydrol. Sci. J. 41 (3), 399e417. https://doi.org/10.1080/02626669609491511. Modarres, R., Ouarda, T.B., 2013. Modeling rainfallerunoff relationship using multivariate GARCH model. J. Hydrol. 499, 1e18. https://doi.org/10.1016/j.jhydrol.2013.06.044. Moeeni, H., Bonakdari, H., 2017. Forecasting monthly inflow with extreme seasonal variation using the hybrid SARIMA-ANN model. Stoch. Environ. Res. Risk Assess. 31, 1997e2010. https://doi.org/10.1007/s00477-016-1273-z. Moeeni, H., Bonakdari, H., Ebtehaj, I., 2017. Monthly reservoir inflow forecasting using a new hybrid SARIMA genetic programming approach. J. Earth Syst. Sci. 126 (2), 18. https:// doi.org/10.1007/s12040-017-0798-y. Moeeni, H., Bonakdari, H., Fatemi, S.E., 2017. Stochastic model stationarization by eliminating the periodic term and its effect on time series prediction. J. Hydrol. 547, 348e364. https:// doi.org/10.1016/j.jhydrol.2017.02.012. Mohammadi, B., Linh, N.T.T., Pham, Q.B., Ahmed, A.N., Vojtekova´, J., Guan, Y., Abba, S.I., ElShafie, A., 2020. Adaptive neuro-fuzzy inference system coupled with shuffled frog leaping algorithm for predicting river streamflow time series. Hydrol. Sci. J. 65 (10), 1738e1751. https://doi.org/10.1080/02626667.2020.1758703. Mohan, S., Vedula, S., 1995. Multiplicative seasonal ARIMA model for longterm forecasting of inflows. Water Resour. Manage. 9 (2), 115e126. https://doi.org/10.1007/BF00872463. Montanari, A., Rosso, R., Taqqu, M.S., 1997. Fractionally differenced ARIMA models applied to hydrologic time series: identification, estimation, and simulation. Water Resour. Res. 33 (5), 1035e1044. https://doi.org/10.1029/97WR00043. Montanari, A., Taqqu, M.S., Teverovsky, V., 1999. 
Estimating long-range dependence in the presence of periodicity: an empirical study. Math. Comput. Model. 29 (10e12), 217e228. https://doi.org/10.1016/S0895-7177(99)00104-1. Moradkhani, H., Hsu, K., Gupta, H.V., Sorooshian, S., 2004. Improved streamflow forecasting using self-organizing radial basis function artificial neural networks. J. Hydrol. 295, 246e262. https://doi.org/10.1016/j.jhydrol.2004.03.027.

Streamflow forecasting Chapter | 1

43

Mosley, M.P., McKerchar, A.I., 1993. Streamflow, chapter 8. In: Maidment, D.R. (Ed.), In Chief), Handbook of Hydrology. McGraw-Hill, Inc., New York, pp. 8.1e8.35. Mujumdar, P.P., Kumar, D.N., 1990. Stochastic models of streamflow: some case studies. Hydrol. Sci. J. 35 (4), 395e410. https://doi.org/10.1080/02626669009492442. Mukerji, A., Chatterjee, C., Raghuwanshi, N.S., 2009. Flood forecasting using ANN, neuro-fuzzy, and neuro-GA models. J. Hydrol. Eng. ASCE 14 (6), 647e652. https://doi.org/10.1061/ (ASCE)HE.1943-5584.0000040. Mutlu, E., Chaubey, I., Hexmoor, H., 2008. Comparison of artificial neural network models for hydrologic predictions at multiple gauging stations in an agricultural watershed. Hydrol. Process. 22 (26), 5097e5106. https://doi.org/10.1002/hyp.7136. Nalley, D., Adamowski, J., Khalil, B., Biswas, A., 2020. A comparison of conventional and wavelet transform based methods for streamflow record extension. J. Hydrol. 582, 124503. https://doi.org/10.1016/j.jhydrol.2019.124503 (2020). Nanda, T., Sahoo, B., Beria, H., Chatterjee, C., 2016. A wavelet-based non-linear autoregressive with exogenous inputs (WNARX) dynamic neural network model for real-time flood forecasting using satellite-based rainfall products. J. Hydrol. 539, 57e73. https://doi.org/10.1016/ j.jhydrol.2016.05.014. Nanda, T., Sahoo, B., Chatterjee, C., 2019. Enhancing real-time streamflow forecasts with waveletneural network based error-updating schemes and ECMWF meteorological predictions in variable infiltration capacity model. J. Hydrol. 575, 890e910. https://doi.org/10.1016/ j.jhydrol.2019.05.051. Nayak, P.C., Sudheer, K.P., Ramasastri, K.S., 2005. Fuzzy computing based rainfall-runoff model for real time flood forecasting. Hydrol. Process. 19, 955e958. https://doi.org/10.1002/hyp.5553. Nayak, P.C., Sudheer, K.P., Rangan, D.M., Ramasastri, K.S., 2004. A neuro-fuzzy computing technique for modeling hydrological time series. J. Hydrol. 291, 52e66. https://doi.org/ 10.1016/j.jhydrol.2003.12.010. Nayak, P.C., Sudheer, K.P., Rangan, D.M., Ramasastri, K.S., 2005. Short-term flood forecasting with a neurofuzzy model. Water Resour. Res. 41, W04004. https://doi.org/10.1029/ 2004WR003562. Ni, L., Wang, D., Wu, J., Wang, Y., Taoa, Y., Zhang, J., Liu, J., 2020. Streamflow forecasting using extreme gradient boosting model coupled with Gaussian mixture model. J. Hydrol. 586, 124901. https://doi.org/10.1016/j.jhydrol.2020.124901. Nigam, R., Nigam, S., Mittal, S.K., 2014. Stochastic modeling of rainfall and runoff phenomenon: a time series approach review. Int. J. Hortic. Sci. Technol. 4 (2), 81e109. https://doi.org/ 10.1504/IJHST.2014.066437. Niu, W.J., Feng, Z.K., Feng, B.F., Xu, Y.S., Min, Y.W., 2021. Parallel computing and swarm intelligence based artificial intelligence model for multi-step-ahead hydrological time series prediction. Sustain. Cities Soci. 66, 102686. https://doi.org/10.1016/j.scs.2020.102686. Noakes, D.J., Hipel, K.W., McLeod, A.I., Jimenez, C., Yakowitz, S., 1988. Forecasting annual geophysical time series. Int. J. Forecast. 4 (1), 103e115. https://doi.org/10.1016/01692070(88)90012-X. Noakes, D.J., McLeod, A.I., Hipel, K.W., 1985. Forecasting monthly riverflow time series. Int. J. Forecast. 1 (2), 179e190. https://doi.org/10.1016/0169-2070(85)90022-6. Nourani, V., Andalib, G., Sadikoglu, F., 2017. Multi-station streamflow forecasting using wavelet denoising and artificial intelligence models. Proc. Comput. Sci. 120, 617e624. https://doi.org/ 10.1016/j.procs.2017.11.287. 
Nourani, V., Baghanam, A.H., Adamowski, J., Kis¸i, O., 2014. Applications of hybrid wavelete artificial intelligence models in hydrology: a review. J. Hydrol. 514, 358e377. https://doi.org/ 10.1016/j.jhydrol.2014.03.057.

44 Advances in Streamflow Forecasting Oki, T., Kanae, S., 2006. Global hydrological cycles and world water resources. Science 313 (5790), 1068e1072. https://doi.org/10.1126/science.1128845. Okkan, U., Serbes, Z.A., 2013. The combined use of wavelet transform and black box models in reservoir inflow modeling. J. Hydrol. Hydromech. 61 (2), 112e119. https://doi.org/10.2478/ johh-2013-0015. Ooms, M., Franses, P.H., 2001. A seasonal periodic long memory model for monthly river flows. Environ. Model. Software 16 (6), 559e569. https://doi.org/10.1016/S1364-8152(01)00025-1. Osman, A., Allawi, M.F., Afan, H.A., Noureldin, A., El-shafie, A., 2016. Acclimatizing fast orthogonal search (FOS) model for River stream-flow forecasting. Hydrol. Earth Syst. Sci. Discuss. 1e28. https://doi.org/10.5194/hess-2016-347. Papacharalampous, G., Tyralis, H., 2020. Hydrological time series forecasting using simple combinations: big data testing and investigations on one-year ahead river flow predictability. J. Hydrol. 590, 125205. https://doi.org/10.1016/j.jhydrol.2020.125205. Partal, T., 2009. River flow forecasting using different artificial neural network algorithms and wavelet transform. Can. J. Civ. Eng. 36 (1), 26e38. https://doi.org/10.1139/L08-090. Pham, Q.B., Afan, H.A., Mohammadi, B., Ahmed, A.N., Linh, N.T.T., Vo, N.D., Moazenzadeh, R., Yu, P.-S., El-Shafie, A., 2020. Hybrid model to improve the river streamflow forecasting utilizing multi-layer perceptron-based intelligent water drop optimization algorithm. Soft Comput. https://doi.org/10.1007/s00500-020-05058-5. Prada-Sarmiento, F., Obrego´n-Neira, N., 2009. Forecasting of monthly streamflows based on artificial neural networks. J. Hydrol. Eng. ASCE 14, 1390e1395. https://doi.org/10.1061/ (ASCE)1084-0699(2009)14:12(1390). Prairie, J.R., Rajagopalan, B., Fulp, T.J., Zagona, E.A., 2006. Modified K-NN model for stochastic streamflow simulation. J. Hydrol. Eng. ASCE 11 (4), 371e378. https://doi.org/10.1061/ (ASCE)1084-0699(2006)11:4(371). Pramanik, N., Panda, R.K., 2009. Application of neural network and adaptive neuro-fuzzy inference systems for stream flow prediction. Hydrol. Sci. J. 54 (2), 247e260. https://doi.org/ 10.1623/hysj.54.2.247. Pramanik, N., Panda, R.K., Singh, A., 2011. Daily river flow forecasting using wavelet ANN hybrid models. J. Hydroinf. 13 (1), 49e63. https://doi.org/10.2166/hydro.2010.040. Pulido-Calvo, I., Portela, M.M., 2007. Application of neural approaches to one-step daily flow forecasting in Portuguese watersheds. J. Hydrol. 332, 1e15. https://doi.org/10.1016/ j.jhydrol.2006.06.015. Raman, H., Sunilkumar, N., 1995. Multivariate modelling of water resources time series using artificial neural networks. Hydrol. Sci. J. 40 (2), 145e163. https://doi.org/10.1080/ 02626669509491401. Rasmussen, P.F., Salas, J.D., Fagherazzi, L., Rassam, J.C., Bobe´e, B., 1996. Estimation and validation of contemporaneous PARMA models for streamflow simulation. Water Resour. Res. 32 (10), 3151e3160. https://doi.org/10.1029/96WR01528. Rathinasamy, M., Adamowski, J., Khosa, R., 2013. Multiscale streamflow forecasting using a new Bayesian model average based ensemble multi-wavelet volterra nonlinear method. J. Hydrol. 507, 186e200. https://doi.org/10.1016/j.jhydrol.2013.09.025. ¨ ., 2017. Wavelet-linear genetic programming: a new approach Ravansalar, M., Rajaee, T., Kis¸i, O for modeling monthly streamflow. J. Hydrol. 549, 461e475. https://doi.org/10.1016/ j.jhydrol.2017.04.018. Riebsame, W.E., Changnon, S.A., Karl, T.R., 1991. 
Drought and Natural Resource Management in the United States: Impacts and Implications of the 1987e89 Drought. Westview Press, Boulder, CO, p. 174.

Streamflow forecasting Chapter | 1

45

Rieu, S.Y., Kim, Y.O., Lee, D.R., 2003. Streamflow Generation Using a Multivariate Hybrid Time Series Model. International Association of Hydrological Sciences, Publication, pp. 255e259. Saad, E.W., Prokhorov, D.V., Wunsch, D.C., 1996. Advanced neural network training methods for low false alarm stock trend prediction. In: Proceedings of International Conference on Neural Networks (ICNN’96), 4. IEEE, pp. 2021e2026. Said, S.E., Dickey, D.A., 1984. Testing for unit roots in autoregressive-moving average models of unknown order. Biometrika 71 (3), 599e607. https://doi.org/10.2307/2336570. Salas, J.D., Boes, D.C., Smith, R.A., 1982. Estimation of ARMA models with seasonal parameters. Water Resour. Res. 18 (4), 1006e1010. https://doi.org/10.1029/WR018i004p01006. Salas, J.D., Delleur, J.W., Yevjevich, V., Lane, W.L., 1980. Applied Modeling of Hydrologic Time Series. Water Resources Publications, Littleton, Colorado, p. 484 (2nd Printing 1985, 3rd Printing, 1988) . Salas, J.D., Markus, M., Tokar, A.S., 2000. Streamflow forecasting based on artificial neural networks. In: Govindaraju, R.S., Ramachandra Rao, A. (Eds.), Artificial Neural Networks in Hydrology. Kluwer Academic Publishers, pp. 23e51. Salas, J.D., Tabios III, G.Q., Bartolini, P., 1985. Approaches to multivariate modeling of water resources time series. Water Resour. Bull. 21 (4), 683e708. https://doi.org/10.1111/j.17521688.1985.tb05383.x. Samadianfard, S., Jarhan, S., Salwana, E., Mosavi, A., Shamshirband, S., Akib, S., 2019. Support vector regression integrated with fruit fly optimization algorithm for river flow forecasting in Lake Urmia Basin. Water 11 (9), 1934. https://doi.org/10.3390/w11091934. Samsudin, R., Saad, P., Shabri, A., 2011. River flow time series using least squares support vector machines. Hydrol. Earth Syst. Sci. 15 (6), 1835e1852. https://doi.org/10.5194/hess15-1835-2011. Sanikhani, H., Kis¸i, O., 2012. River flow estimation and forecasting by using two different adaptive neuro-fuzzy approaches. Water Resour. Manag. 26 (6), 1715e1729. https://doi.org/ 10.1007/s11269-012-9982-7. Sattari, M.T., Yurekli, K., Pal, M., 2012. Performance evaluation of artificial neural network approaches in forecasting reservoir inflow. Appl. Math. Model. 36 (6), 2649e2657. https:// doi.org/10.1007/s11269-012-9982-7. Schewe, J., Heinke, J., Gerten, D., Haddeland, I., Arnell, N.W., Clark, D.B., Dankers, R., Eisner, S., Fekete, B.M., Colo´n-Gonza´lez, F.J., Gosling, S.N., 2014. Multimodel assessment of water scarcity under climate change. Proc. Natl. Acad. Sci. U. S A. 111 (9), 3245e3250. https://doi.org/10.1073/pnas.1222460110. Schwefel, H.P., 1981. Numerical Optimization of Computer Models. John Wiley & Sons, Inc. See, L., Openshaw, S., 2000. A hybrid multi-model approach to river level forecasting. Hydrol. Sci. J. 45 (4), 523e536. https://doi.org/10.1080/02626660009492354. Shabri, A., Suhartono, 2012. Streamflow forecasting using least-squares support vector machines. Hydrol. Sci. J. 57 (7), 1275e1293. https://doi.org/10.1080/02626667.2012.714468. Shamseldin, A.Y., O’Connor, K.M., Liang, G.C., 1997. Methods for combining the outputs of different rainfallerunoff models. J. Hydrol. 197 (1e4), 203e229. https://doi.org/10.1016/ S0022-1694(96)03259-3. Shao, Q., Wong, H., Li, M., Ip, W.C., 2009. Streamflow forecasting using functional-coefficient time series model with periodic variation. J. Hydrol. 368 (1e4), 88e95. https://doi.org/ 10.1016/j.jhydrol.2009.01.029. Sharma, P., Bhakar, S.R., Ali, S., Jain, H.K., Singh, P.K., Kothari, M., 2018. 
Generation of synthetic streamflow of jakham river, Rajasthan using thomas-fiering model. J. Agric. Eng. 55 (4), 47e56.

46 Advances in Streamflow Forecasting Sharma, A., O’Neill, R., 2002. A nonparametric approach for representing interannual dependence in monthly streamflow sequences. Water Resour. Res. 38 (7), 5e1e5e10. https://doi.org/ 10.1029/2001WR000953. Sharma, S.K., Tiwari, K.N., 2009. Bootstrap based artificial neural network (BANN) analysis for hierarchical prediction of monthly runoff in Upper Damodar Valley Catchment. J. Hydrol. 374 (3e4), 209e222. https://doi.org/10.1016/j.jhydrol.2009.06.003. ¨ ., 2010. Short-term and long-term streamflow forecasting using a wavelet and Shiri, J., Kis¸i, O neuro-fuzzy conjunction model. J. Hydrol. 394, 486e493. https://doi.org/10.1016/ j.jhydrol.2010.10.008. Shoaib, M., Shamseldin, A.Y., Melville, B.W., Khan, M.M., 2015. Runoff forecasting using hybrid wavelet gene expression programming (WGEP) approach. J. Hydrol. 527, 326e344. https:// doi.org/10.1016/j.jhydrol.2015.04.072. Sim, C.H., 1987. A mixed gamma ARMA (1, 1) model for river flow time series. Water Resour. Res. 23 (1), 32e36. https://doi.org/10.1029/WR023i001p00032. Sivapragasam, C., Liong, S.Y., 2005. Flow categorization model for improving forecasting. Nord. Hydrol. 36 (1), 37e48. https://doi.org/10.2166/nh.2005.0004. Smith, J., Eli, R.N., 1995. Neural-network models of rainfall-runoff process. J. Water Resour. Plann. Manag. ASCE 121 (6), 499e508. https://doi.org/10.1061/(ASCE)0733-9496(1995) 121:6(499). Solomatine, D.P., Dulal, K.N., 2003. Model trees as an alternative to neural networks in rainfalldrunoff modelling. Hydrol. Sci. J. 48 (3), 399e411. https://doi.org/10.1623/ hysj.48.3.399.4529. Solomatine, D., See, L.M., Abrahart, R.J., 2009. Data-driven modelling: concepts, approaches and experiences. In: Abrahart, R.J., et al. (Eds.), Practical Hydroinformatics. Water Science and Technology Library 68. Springer-Verlag Berlin Heidelberg, pp. 17e30. Solomotine, D.P., Ostfeld, A., 2008. Data-driven modelling: some past experiences and new approaches. J. Hydroinf. 10 (1), 3e22. https://doi.org/10.2166/hydro.2008.015. Srinivas, V.V., Srinivasan, K., 2000. Post-blackening approach for modeling dependent annual streamflows. J. Hydrol. 230 (1e2), 86e126. https://doi.org/10.1016/S0022-1694(00)00168-2. Srinivas, V.V., Srinivasan, K., 2001. A hybrid stochastic model for multiseason streamflow simulation. Water Resour. Res. 37 (10), 2537e2549. https://doi.org/10.1029/2000WR900383. Stojkovic, M., Kostic, S., Plavsic, J., Prohaska, S., 2017. A joint stochastic-deterministic approach for long-term and short-term modelling of monthly flow rates. J. Hydrol. 544, 555e566. https://doi.org/10.1016/j.jhydrol.2016.11.025. Sudheer, K.P., Srinivasan, K., Neelakantan, T.R., Srinivas, V.V., 2008. A nonlinear data-driven model for synthetic generation of annual streamflows. Hydrol. Process. 22, 1831e1845. https://doi.org/10.1002/hyp.6764. Sun, A.Y., Wang, D., Xu, X., 2014. Monthly streamflow forecasting using Gaussian process regression. J. Hydrol. 511, 72e81. https://doi.org/10.1016/j.jhydrol.2014.01.023. Talaee, P.H., 2014. Multilayer perceptron with different training algorithms for streamflow forecasting. Neural Comput. Appl. 24 (3e4), 695e703. https://doi.org/10.1007/s00521-012-1287-5. Tan, Q.F., Lei, X.H., Wang, X., Wang, H., Wen, X., Ji, Y., Kang, A.Q., 2018. An adaptive middle and long-term runoff forecast model using EEMDANN hybrid approach. J. Hydrol. 567, 767e780. https://doi.org/10.1016/j.jhydrol.2018.01.015. Sen, Z., 1978. A mathematical model of monthly flow sequences. Hydrol. Sci. J. 23 (2), 223e229. 
¨ ., Ergin, G., 2014. Forecasting of monthly river flow with autoregressive modeling and Terzi, O data-driven techniques. Neural Comput. Appl. 25, 179e188. https://doi.org/10.1007/s00521013-1469-9.

Streamflow forecasting Chapter | 1

47

Tesfaye, Y.G., Meerschaert, M.M., Anderson, P.L., 2006. Identification of periodic autoregressive moving average models and their application to the modeling of river flows. Water Resour. Res. 42 (1) https://doi.org/10.1029/2004WR003772. Thirumalaiah, K., Deo, M.C., 1998. Real-time flood forecasting using neural networks. Comput. Aided Civ. Infrastruct. Eng. 13, 101e111. https://doi.org/10.1111/0885-9507.00090. Thirumalaiah, K., Deo, M.C., 2000. Hydrological forecasting using neural networks. J. Hydrol. Eng. ASCE 5 (2), 180e189. https://doi.org/10.1061/(ASCE)1084-0699(2000)5:2(180). Tikhamarine, Y., Souag-Gamane, D., Ahmed, A.N., Kis¸i, O., El-Shafie, A., 2020. Improving artificial intelligence models accuracy for monthly streamflow forecasting using grey Wolf optimization (GWO) algorithm. J. Hydrol. 582, 124435. https://doi.org/10.1016/ j.jhydrol.2019.124435. Tiwari, M.K., Chatterjee, C., 2011. A new waveletebootstrapeANN hybrid model for daily discharge forecasting. J. Hydroinf. 13 (3), 500e519. https://doi.org/10.2166/HYDRO.2010.142. Tong, H., 1990. Non-Linear Time Series: A Dynamical System Approach. Oxford University Press, Oxford, UK. Torabi, S.A., Mastouri, R., Najarchi, M., 2020. Daily flow forecasting of perennial rivers in an arid watershed: a hybrid ensemble decomposition approach integrated with computational intelligence techniques. J. Water Supply Res. Technol. - Aqua 69 (6), 555e577. https://doi.org/ 10.2166/aqua.2020.138. Toro, C.H.F., Meire, S.G., Ga´lvez, J.F., Fdez-Riverola, F., 2013. A hybrid artificial intelligence model for river flow forecasting. Appl. Soft Comput. 13 (8), 3449e3458. https://doi.org/ 10.1016/j.asoc.2013.04.014. Toth, E., Brath, A., 2007. Multistep ahead streamflow forecasting: role of calibration data in conceptual and neural network modeling. Water Resour. Res. 43 (11) https://doi.org/10.1029/ 2006WR005383. Toth, E., Brath, A., Montanari, A., 2000. Comparison of short-term rainfall prediction models for real-time flood forecasting. J. Hydrol. 239 (1), 132e147. https://doi.org/10.1016/S00221694(00)00344-9. Trenberth, K.E., 2001. Climate variability and global warming. Science 293 (5527), 48e49. https://doi.org/10.1126/science.293.5527.48. Turan, M.E., 2016. Fuzzy systems tuned by swarm based optimization algorithms for predicting stream flow. Water Resour. Manag. 30 (12), 4345e4362. https://doi.org/10.1007/s11269-0161424-5. Valipour, M., 2015. Long-term runoff study using SARIMA and ARIMA models in the United States. Meteorol. Appl. 22 (3), 592e598. https://doi.org/10.1002/met.1491. Valipour, M., Banihabib, M.E., Behbahani, S.M.R., 2013. Comparison of the ARMA, ARIMA, and the autoregressive artificial neural network models in forecasting the monthly inflow of Dez dam reservoir. J. Hydrol. 476, 433e441. https://doi.org/10.1016/j.jhydrol.2012.11.017. Valipour, M., Montazar, A., 2012. Optimize of all effective infiltration parameters in furrow irrigation using visual basic and genetic algorithm programming. Aust. J. Basic Appl. Sci. 6, 132e137. Valipour, M., Montazar, A., 2012. Sensitive analysis of optimized infiltration parameters in SWDC model. Adv. Environ. Biol. 6, 2574e2581. Vapnik, V., 1995. The Nature of Statistical Learning Theory. Springer-Verlag New York, Inc., New York, NY, USA. Vogel, R.M., Shallcross, A.L., 1996. The moving blocks bootstrap versus parametric time series models. Water Resour. Res. 32 (6), 1875e1882. https://doi.org/10.1029/96WR00928.

48 Advances in Streamflow Forecasting Wang, W.C., Chau, K.W., Cheng, C.T., Qiu, L., 2009. A comparison of performance of several artificial intelligence methods for forecasting monthly discharge time series. J. Hydrol. 374 (3e4), 294e306. https://doi.org/10.1016/j.jhydrol.2009.06.019. Wang, W.C., Chau, K.W., Xu, D.M., Chen, X.Y., 2015. Improving forecasting accuracy of annual runoff time series using ARIMA based on EEMD decomposition. Water Resour. Manage. 29 (8), 2655e2675. https://doi.org/10.1007/s11269-015-0962-6. Wang, X.X., Chen, S., Lowe, D., Harris, C.J., 2006. Sparse support vector regression based on orthogonal forward selection for the generalised kernel model. Neurocomputing 70, 462e474. https://doi.org/10.1016/j.neucom.2005.12.129. Wang, Z., Fathollahzadeh Attar, N., Khalili, K., Behmanesh, J., Band, S.S., Mosavi, A., Chau, K.W., 2020. Monthly streamflow prediction using a hybrid stochastic-deterministic approach for parsimonious non-linear time series modeling. Eng. Applicat. Comput. Fluid Mech. 14 (1), 1351e1372. https://doi.org/10.1080/19942060.2020.1830858. Wang, W., Gelder, P.H.A.J., Vrijling, J.K., Ma, J., 2005. Testing and modelling autoregressive conditional heteroskedasticity of streamflow processes. Nonlinear Process. Geophy. 12 (1), 55e66. https://doi.org/10.5194/npg-12-55-2005. Wang, Z.Y., Qiu, J., Li, F.F., 2018. Hybrid models combining EMD/EEMD and ARIMA for Longterm streamflow forecasting. Water 10 (7), 853. https://doi.org/10.3390/w10070853. Wang, W., Van Gelder, P.H.A.J.M., Vrijling, J.K., Ma, J., 2006. Forecasting daily streamflow using hybrid ANN models. J. Hydrol. 324, 383e399. https://doi.org/10.1016/j.jhydrol.2005.09.032. Wang, E., Zhang, Y., Luo, J., Chiew, F.H., Wang, Q.J., 2011. Monthly and seasonal streamflow forecasts using rainfall-runoff modeling and historical weather data. Water Resour. Res. 47 (5) https://doi.org/10.1029/2010WR009922. Wei, S., Yang, H., Song, J., Abbaspour, K., Xu, Z., 2013. A wavelet-neural network hybrid modelling approach for estimating and predicting river monthly flows. Hydrol. Sci. J. 58 (2), 374e389. https://doi.org/10.1080/02626667.2012.754102. Wen, L., 2009. Reconstruction natural flow in a regulated system, the Murrumbidgee River, Australia, using time series analysis. J. Hydrol. 364, 216e226. https://doi.org/10.1016/ j.jhydrol.2008.10.023. Wiche, G.J., Holmes Jr., R.R., 2016. Streamflow data, chapter 13. In: Admas III, T.E., Pagano, T.C. (Eds.), Flood Forecasting - A Global Perspective. Elsevier Inc., pp. 371e398 Winters, P.R., 1960. Forecasting sales by exponentially weighted moving averages. Manag. Sci. 6 (3), 324e342. https://doi.org/10.1287/mnsc.6.3.324. Wu, C.L., Chau, K.W., 2010. Data-driven models for monthly streamflow time series prediction. Eng. Appl. Artif. Intell. 23 (8), 1350e1367. https://doi.org/10.1016/j.engappai.2010.04.003. Wu, C.L., Chau, K.W., Li, Y.S., 2009. Predicting monthly streamflow using data-driven models coupled with data-preprocessing techniques. Water Resour. Res. 45, W08432. https://doi.org/ 10.1029/2007WR006737. Wu, J., Zhou, J., Chen, L., Ye, L., 2015. Coupling forecast methods of multiple rainfall- runoff models for improving the precision of hydrological forecasting. Water Resour. Manage. 29 (14), 5091e5108. https://doi.org/10.1007/s11269-015-1106-8. Xie, H., Li, D., Xiong, L., 2016. Exploring the regional variance using ARMA-GARCH models. Water Resour. Manag. 30 (10), 3507e3518. https://doi.org/10.1007/s11269-016-1367-x. Xie, T., Zhang, G., Hou, J., Xie, J., Lv, M., Liu, F., 2019. 
Hybrid forecasting model for nonstationary daily runoff series: a case study in the Han River Basin, China. J. Hydrol. 577, 123915. https://doi.org/10.1016/j.jhydrol.2019.123915. Xing, B., Gan, R., Liu, G., Liu, Z., Zhang, J., Ren, Y., 2016. Monthly mean streamflow prediction based on bat algorithm-support vector machine. J. Hydrol. Eng. ASCE 21 (2), 04015057. https://doi.org/10.1061/(ASCE)HE.1943-5584.0001269.


Xiong, L., Shamseldin, A.Y., O’Connor, K.M., 2001. A non-linear combination of the forecasts of rainfall-runoff models by the first-order Takagi-Sugeno fuzzy system. J. Hydrol. 245 (1e4), 196e217. https://doi.org/10.1016/S0022-1694(01)00349-3. Xu, C.Y., Xiong, L., Singh, V.P., 2019. Black-Box hydrological models. In: Duan, Q., Pappenberger, F., Thielen, J., Wood, A., Cloke, H., Schaake, J. (Eds.), Handbook of Hydrometeorological Ensemble Forecasting. Springer, Heidelberg, p. 1528. https://doi.org/10.1007/ 978-3-642-40457-3_21-1. Yaghoubi, B., Hosseini, S.A., Nazif, S., 2019. Monthly prediction of streamflow using data-driven models. J. Earth Syst. Sci. 128 (6), 1e15. https://doi.org/10.1007/s12040-019-1170-1. Yakowitz, S.J., 1979. A nonparametric Markov model for daily river flow. Water Resour. Res. 15 (5), 1035e1043. https://doi.org/10.1029/WR015i005p01035. Yang, T., Asanjan, A.A., Welles, E., Gao, X., Sorooshian, S., Liu, X., 2017. Developing reservoir monthly inflow forecasts using artificial intelligence and climate phenomenon information. Water Resour. Res. 53 (4), 2786e2812. https://doi.org/10.1002/2017WR020482. Yarar, A., 2014. A hybrid wavelet and neuro-fuzzy model for forecasting the monthly streamflow data. Water Resour. Manag. 28 (2), 553e565. https://doi.org/10.1007/s11269-013-0502-1. Yaseen, Z.M., Awadh, S.M., Sharafati, A., Shahid, S., 2018. Complementary data-intelligence model for river flow simulation. J. Hydrol. 567, 180e190. https://doi.org/10.1016/ j.jhydrol.2018.10.020. Yaseen, Z.M., Ebtehaj, I., Bonakdari, H., Deo, R.C., Mehr, A.D., Mohtar, W.H.M.W., Diop, L., ElShafie, A., Singh, V.P., 2017. Novel approach for streamflow forecasting using a hybrid ANFIS-FFA model. J. Hydrol. 554, 263e276. https://doi.org/10.1016/j.jhydrol.2017.09.007. Yaseen, Z.M., Ebtehaj, I., Kim, S., Sanikhani, H., Asadi, H., Ghareb, M.I., Bonakdari, H., Wan Mohtar, W.H.M., Al-Ansari, N., Shahid, S., 2019. Novel hybrid data-intelligence model for forecasting monthly rainfall with uncertainty analysis. Water 11 (3), 502. https://doi.org/ 10.3390/w11030502. Yaseen, Z.M., El-Shafie, A., Jaafar, O., Afan, H.A., Sayl, K.N., 2015. Artificial intelligence based models for stream-flow forecasting: 2000e2015. J. Hydrol. 530, 829e844. https://doi.org/ 10.1016/j.jhydrol.2015.10.038. Yaseen, Z.M., Fu, M., Wang, C., Mohtar, W.H.M.W., Deo, R.C., El-Shafie, A., 2018. Application of the hybrid artificial neural network coupled with rolling mechanism and grey model algorithms for streamflow forecasting over multiple time horizons. Water Resour. Manage. 32 (5), 1883e1899. https://doi.org/10.1007/s11269-018-1909-5. Yaseen, Z.M., Jaafar, O., Deo, R.C., Kis¸i, O., Adamowski, J., Quilty, J., El-Shafie, A., 2016. Stream-flow forecasting using extreme learning machines: a case study in a semi-arid region in Iraq. J. Hydrol. 542, 603e614. https://doi.org/10.1016/j.jhydrol.2016.09.035. Yaseen, Z.M., Kisi, O., Demir, V., 2016. Enhancing long-term streamflow forecasting and predicting using periodicity data component: application of artificial intelligence. Water Resour. Manage. https://doi.org/10.1007/s11269-016-1408-5. Yaseen, Z.M., Sulaiman, S.O., Deo, R.C., Chau, K.W., 2019a. An enhanced extreme learning machine model for river flow forecasting: state-of-the-art, practical applications in water resource engineering area and future research direction. J. Hydrol. 569, 387e408. https:// doi.org/10.1016/j.jhydrol.2018.11.069. Yu, P.S., Tseng, T.Y., 1996. A model to forecast flow with uncertainty analysis. Hydrol. Sci. J. 
41 (3), 327e344. https://doi.org/10.1080/02626669609491506. Yu, X., Zhang, X., Qin, H., 2018. A data-driven model based on Fourier transform and support vector regression for monthly reservoir inflow forecasting. J. Hydro-environ. Res. 18, 12e24. https://doi.org/10.1016/j.jher.2017.10.005.

50 Advances in Streamflow Forecasting Yule, G.U., 1927. On the method of investigating periodicities in disturbed series, with special reference to Wolfer’s sunspot numbers. Philos. Trans. R. Soc. London, Ser. A 226, 267e298. https://doi.org/10.1098/rsta.1927.0007. Zadeh, L., 1965. Fuzzy sets. Inf. Contr. 8 (3), 338e353. https://doi.org/10.1142/ 9789814261302_0021. Zealand, C.M., Burn, D.H., Simonovic, S.P., 1999. Short term streamflow forecasting using artificial neural networks. J. Hydrol. 214 (1e4), 32e48. https://doi.org/10.1016/S0022-1694(98) 00242-X. Zha, X., Xiong, L., Guo, S., Kim, J.S., Liu, D., 2020. AR-GARCH with exogenous variables as a postprocessing model for improving streamflow forecasts. J. Hydrol. Eng. ASCE 25 (8), 04020036. https://doi.org/10.1061/(ASCE)HE.1943-5584.0001955. Zhang, Di, Lin, J., Peng, Q., Wang, D., Yang, T., Sorooshian, S., Liu, X., Zhuang, J., 2018. Modeling and simulating of reservoir operation using the artificial neural network, support vector regression, deep learning algorithm. J. Hydrol. 565, 720e736. https://doi.org/10.1016/ j.jhydrol.2018.08.050. Zhang, X., Peng, Y., Zhang, C., Wang, B., 2015. Are hybrid models integrated with data preprocessing techniques suitable for monthly streamflow forecasting? Some experiment evidences. J. Hydrol. 530, 137e152. https://doi.org/10.1016/j.jhydrol.2015.09.047. Zhang, H., Singh, V.P., Wang, B., Yu, Y., 2016. CEREF: a hybrid data-driven model for forecasting annual streamflow from a socio-hydrological system. J. Hydrol. 540, 246e256. https://doi.org/ 10.1016/j.jhydrol.2016.06.029. Zhang, Q., Wang, B.D., He, B., Peng, Y., Ren, M.L., 2011. Singular spectrum analysis and ARIMA hybrid model for annual runoff forecasting. Water Resour. Manage. 25 (11), 2683e2703. https://doi.org/10.1007/s11269-011-9833-y. Zhang, Z., Zhang, Q., Singh, V.P., 2018. Univariate streamflow forecasting using commonly used data-driven models: literature review and case study. Hydrol. Sci. J. 63 (7), 1091e1111. https://doi.org/10.1080/02626667.2018.1469756. Zhou, J., Peng, T., Zhang, C., Sun, N., 2018. Data pre-analysis and ensemble of various artificial neural networks for monthly streamflow forecasting. Water 10 (5), 628. https://doi.org/ 10.3390/w10050628. Zuo, G., Luo, J., Wang, N., Lian, Y., He, X., 2020. Decomposition ensemble model based on variational mode decomposition and long short-term memory for streamflow forecasting. J. Hydrol. 585, 124776. https://doi.org/10.1016/j.jhydrol.2020.124776.

Chapter 2

Streamflow forecasting at large time scales using statistical models

Hristos Tyralis 1, Georgia Papacharalampous 2, Andreas Langousis 3

1 Air Force Support Command, Hellenic Air Force, Elefsina Air Base, Elefsina, Greece; 2 Department of Water Resources and Environmental Engineering, School of Civil Engineering, National Technical University of Athens, Iroon Polytechniou 5, Zografou, Greece; 3 Department of Civil Engineering, School of Engineering, University of Patras, University Campus, Rio, Patras, Greece

2.1 Introduction

Delivering accurate streamflow forecasts at large time scales, i.e., monthly or annual, is important for water resources management. As time scales get larger, linear and Gaussian patterns start to prevail in geophysical and hydrological processes, such as streamflow discharges and precipitation, and statistical models become accurate enough in forecasting a univariate time series. The existing statistical models widely used for forecasting hydrological time series can be classified into two categories: (a) exponential smoothing (ETS) and (b) autoregressive moving average (ARMA) and other similar models (De Gooijer and Hyndman, 2006). While classical ARMA models have been extensively used for streamflow forecasting, the use of ETS has been limited in hydrological forecasting (Papacharalampous et al., 2019a). Some early examples from both categories of statistical models for precipitation and streamflow forecasting can be found in Carlson et al. (1970) for ARMA models and Dyer (1977) for ETS, while there exist other relevant studies that belong to the broader subject of stochastic hydrology (Kazmann, 1964; Matalas, 1967; Yevjevich, 1968, 1987; Scheidegger, 1970; Beaumont, 1979). In the literature of time series forecasting, both types of models have been exploited extensively; for example, see Box et al. (2015) for ARMA models and Gardner (1985, 2006) for ETS. In the literature related to forecasting, it has become a tradition to apply a variety of methods to big datasets aiming at appropriate benchmarking and better understanding of the properties of the proposed algorithms. This has


been achieved (to some extent) through forecasting competitions, where multiple teams compete to deliver accurate forecasts for a given dataset (Hyndman, 2020). Such competitions have promoted the common knowledge that new methods should be tested on large datasets, and this knowledge has been transferred to the publication tradition in the field of forecasting, which is largely based on empirical results from large-scale studies. However, in the field of hydrology, the subject of forecasting advanced through numerous small-scale applications, as is evident from a recent review presented by Papacharalampous et al. (2019a), except for some early large-scale studies, i.e., forecasting 157 annual rainfall time series (Dyer, 1977) and 30 monthly streamflow series (Noakes et al., 1985). Very recently, some large-scale studies were conducted for forecasting of 297 annual temperature and precipitation, 2537 monthly temperature and precipitation, and 405 annual streamflow time series (Papacharalampous et al., 2018a,b, 2019a).

This chapter focuses on point forecasting with statistical models. The aims of this chapter are (i) to present a literature review on forecasting with statistical models in general and in the field of hydrology in particular, (ii) to describe the relevant theory of simple statistical models, and (iii) to apply statistical models to a big dataset of time series, in order to demonstrate their use and provide an understanding of the behavior of such algorithms. Several textbooks on the subject of forecasting exist; therefore, we also aim to popularize the existing knowledge among hydrologists and connect the relevant literature with current practice. Literature and large-scale applications of machine learning algorithms to forecast geophysical processes can be found in Papacharalampous et al. (2018c, 2019a), Tyralis and Papacharalampous (2017), and Tyralis et al. (2019b), among others. The literature also focuses on the use of process-based models for streamflow forecasting (Blöschl et al., 2019), but those models are outside the scope of the present chapter. Large-scale studies, where process-based and machine learning algorithms are combined, can be found in Papacharalampous et al. (2019b, 2020a,b) and Tyralis et al. (2019a), while automated methods applied to forecasting of hydrological time series can be found in Papacharalampous and Tyralis (2018) and Tyralis and Papacharalampous (2018).

2.2 Overview of statistical models used in forecasting

2.2.1 Forecasting in general

The major focus of this section is on presenting a review of autoregressive integrated moving average (ARIMA) and ETS models as well as their variants. We focus on forecasting rather than time series analysis or simulation themes. A formal presentation of the models discussed in this section is provided in the following subsections.


A time series is a set of chronologically ordered observations. Let

\[ z := \{z_1, \ldots, z_n\} \tag{2.1} \]

be a time series of n values, observed at equispaced time intervals, where (:=) denotes definition of a variable or function. We are interested in forecasting z_{n+l}, l ∈ {1, ...}, given z. One-step ahead forecasting is the case when l = 1, while multi-step ahead forecasting refers to the case when l > 1.

2.2.1.1 ARIMA models

Box and Jenkins (1970) presented a family of linear stochastic processes termed ARIMA models. The book by Box and Jenkins (1970), which builds on earlier works starting from Yule (1927) and Wold (1938), is an excellent compilation of the detailed information on stochastic modeling that practically popularized ARIMA models. ARIMA stochastic processes can be used to model a time series. In principle, they are fitted to the time series, and then the fitted model is used for forecasting purposes. Natural extensions of ARIMA models are autoregressive fractionally integrated moving average (ARFIMA) models (Granger and Joyeux, 1980; Hosking, 1981), which can model long-range dependence. Long-range dependence can also be modeled by fractional Gaussian noise (fGn) (Kolmogorov, 1940; Hurst, 1951; Mandelbrot and Van Ness, 1968; Veneziano and Langousis, 2010; Beran et al., 2013), which is another stochastic process of interest. Seasonal autoregressive integrated moving average (SARIMA) models are extensions of ARIMA models, which can model seasonality present in the time series (Box et al., 2015).

2.2.1.2 Exponential smoothing models

The concept of ETS was originated by Brown (1959), Winters (1960), and Holt (2004). Simple exponential smoothing (SES) models are weighted moving averages. However, the ETS family is optimal for a very general class of state-space models, and therefore, it is broader than the ARIMA class (Gardner, 2006). Beyond SES, several ETS models exist, which can forecast time series with trends and seasonality. A classification of ETS models, in the context of forecasting, is presented in Gardner (2006).

2.2.1.3 General literature

Several classical books provide comprehensive overviews and information on statistical forecasting methods. Most of them also address other issues, e.g., time series analysis and general themes on forecasting (Table 2.1). Most books address both time series analysis and time series forecasting themes, whereas there are books which specialize in forecasting or certain time series models (e.g., models with long-range dependence or seasonality).


TABLE 2.1 Prominent books providing a comprehensive overview of statistical forecasting methods.

S. No. | References | Theme
1 | Box and Jenkins (1970) | ARIMA models; a large part is devoted to time series modeling
2 | Brockwell and Davis (1991) | Introduction to time series analysis and forecasting methods
3 | Hipel and McLeod (1994) | Time series analysis and forecasting with statistical models in hydrology
4 | Armstrong (2001) | Presentation of principles of forecasting
5 | Hyndman et al. (2008) | A book specializing in exponential smoothing models
6 | Machiwal and Jha (2012) | A book dealing with theory as well as practical case studies of hydrologic time series analysis
7 | Beran et al. (2013) | A book specializing in modeling long-range dependence
8 | Box et al. (2015) | Introduction to time series analysis and forecasting methods
9 | Brockwell and Davis (2016) | Introduction to time series analysis and forecasting methods
10 | Dagum and Bianconcini (2016) | A book specializing in seasonal adjustment methods
11 | De Gooijer (2017) | Nonlinear statistical models for time series analysis and forecasting
12 | Hyndman and Athanasopoulos (2018) | A book specializing in forecasting

Table 2.2 lists some excellent review papers on forecasting methods, with their themes ranging from ETS models to general forecasting methods. Likewise, Table 2.3 summarizes a few key papers explaining the history of time series analysis or analysis with ARIMA and other related models, including ARFIMA, SARIMA, and fGn. In addition, prominent studies dealing with the application of ETS models for forecasting are presented in Table 2.4. The studies are related to introductory work as well as the connection between ETS and state-space models.

2.2.1.4 Literature in hydrology

Indicative literature on the use of ARIMA and related models in hydrology is summarized in Table 2.5, which includes early forecasting studies, some


TABLE 2.2 Review articles on exponential smoothing and other forecasting models.

S. No. | References | Theme
1 | Gardner (1985) | Review of exponential smoothing models
2 | Chatfield (1988) | Review of forecasting methods
3 | Gardner (2006) | Review of exponential smoothing models
4 | De Gooijer and Hyndman (2006) | Review of forecasting methods
5 | Hyndman (2020) | Review of forecasting competitions

TABLE 2.3 Key literature describing time series analysis and forecasting with autoregressive integrated moving average (ARIMA) models and their extensions.

S. No. | References | Theme
1 | Yule (1927) | Roots for ARIMA modeling
2 | Wold (1938) | Roots for ARIMA modeling
3 | Kolmogorov (1940) | Introduction of fractional Brownian motion (fBm) (fGn [fractional Gaussian noise] is increments of fBm)
4 | Hurst (1951) | First to notice the Hurst phenomenon when investigating streamflow time series
5 | Hurst (1956) | First to notice the Hurst phenomenon when investigating streamflow time series
6 | Mandelbrot and Van Ness (1968) | Expository work on fGn
7 | Granger (1978) | Introduction of ARFIMA models
8 | Granger and Joyeux (1980) | Introduction of ARFIMA models
9 | Hosking (1981) | Introduction of ARFIMA models
10 | De Gooijer et al. (1985) | Methods for determining the order of ARMA models
11 | Peiris and Perera (1988) | Forecasting with ARFIMA models
12 | Liu (1989) | Identification of SARIMA models
13 | Cox (1991) | Long-range dependence and nonlinearity
14 | Franses and Ooms (1997) | SARIMA model with varying long-range dependence magnitude
15 | Graves et al. (2017) | A review of ARFIMA models

TABLE 2.4 Literature dealing with forecasting through exponential smoothing models.

S. No. | References | Theme
1 | Brown (1959) | Introduction of exponential smoothing models
2 | Winters (1960) | Advancement of exponential smoothing models
3 | Pegels (1969) | Advancement of exponential smoothing models
4 | Gardner and McKenzie (1985) | Advancement of exponential smoothing models
5 | Ord et al. (1997) | Equivalence between exponential smoothing and state-space models
6 | Hyndman et al. (2002) | Equivalence between exponential smoothing and state-space models
7 | Taylor (2003) | Advancement of exponential smoothing models
8 | Holt (2004) | Introduction of exponential smoothing models
9 | De Livera et al. (2011) | Automatic forecasting with exponential smoothing models

representative modeling papers, and studies dealing with big datasets. Most of the hydrological studies focus on time series modeling with fGn, although ARIMA models have also been used widely in hydrology. Regarding the use of big datasets, some early forecasting studies included 30 and 157 time series, while studies including a much larger number (i.e., more than 2500 time series) appeared only very recently in the literature. Table 2.6 presents studies involving applications of ETS models for forecasting hydrological time series. It is surprising that only a few studies utilized ETS for forecasting hydrological variables, despite its successful use in other fields.


TABLE 2.5 Summary of literature presenting time series analysis and forecasting with autoregressive integrated moving average (ARIMA) models and their extensions in hydrology.

S. No. | References | Theme
1 | Mandelbrot and Wallis (1968) | Introduction of fractional Gaussian noise (fGn) in hydrology
2 | Carlson et al. (1970) | Forecasting of one annual streamflow time series
3 | Klemes (1974) | Investigation of the Hurst phenomenon in hydrology
4 | Dyer (1977) | Forecasting of 157 annual rainfall time series using AR models
5 | Hipel et al. (1977a,b) | ARIMA models in hydrology
6 | Ledolter (1978) | SARIMA models applied in hydrology
7 | McLeod and Hipel (1978a,b,c) | Investigation of the Hurst phenomenon in hydrology
8 | Noakes et al. (1985) | Forecasting of 30 monthly streamflow time series
9 | Noakes et al. (1988) | Forecasting of five annual temperature and streamflow time series
10 | Tyralis and Koutsoyiannis (2011) | Comparison of multiple estimators of the fGn parameters
11 | Tyralis and Koutsoyiannis (2014) | Probabilistic forecasting for fGn. Application to three annual temperature, precipitation and streamflow time series
12 | Vafakhah et al. (2017) | A book chapter related to precipitation forecasting
13 | Papacharalampous et al. (2018a) | Forecasting of 297 annual temperature and precipitation time series using ARFIMA models
14 | Papacharalampous et al. (2018b) | Forecasting of 2537 monthly temperature and precipitation time series using seasonal decomposition and ARFIMA models
15 | Papacharalampous et al. (2019a) | Forecasting of 405 annual streamflow time series using ARFIMA models

2.3 Theory

In this section, we present some theoretical aspects of the application of ARIMA and ETS models for time series forecasting. We note that an extended review of the subject is provided in the books listed in the previous section.


TABLE 2.6 Literature on forecasting with exponential smoothing models in hydrology.

S. No. | References | Theme
1 | Dyer (1977) | Forecasting of 157 annual rainfall time series using exponential smoothing models
2 | Hong and Pai (2007) | Forecasting of an hourly rainfall time series
3 | Murat et al. (2016) | Forecasting of eight temperature and precipitation time series
4 | Papacharalampous et al. (2018a) | Forecasting of 297 annual temperature and precipitation time series using exponential smoothing models
5 | Papacharalampous et al. (2018b) | Forecasting of 2537 monthly temperature and precipitation time series using seasonal decomposition and exponential smoothing models
6 | Papacharalampous et al. (2019a) | Forecasting of 405 annual streamflow time series using exponential smoothing models

2.3.1 ARIMA models

2.3.1.1 Definition

Let z_t denote an ARIMA stochastic process used to model the time series z in Eq. (2.1). The ARIMA(p, d, q) process is defined by (Brockwell and Davis, 2016)

\[ \varphi_p(B)\,(1 - B)^d z_t = \theta_q(B)\, a_t \tag{2.2} \]

where d is the integer order of differencing (i.e., a natural number), p is the integer order of the autoregressive part of the model, q is the integer order of the moving average part of the model, and B is the backshift operator (see Eqs. 2.3 and 2.4), i.e.,

\[ B z_t := z_{t-1} \tag{2.3} \]

and

\[ B^m z_t := z_{t-m}, \quad m = 2, 3, \ldots \tag{2.4} \]

where m is the time lag. The stationary autoregressive (AR) operator is defined as

\[ \varphi_p(B) := 1 - \varphi_1 B - \cdots - \varphi_p B^p \tag{2.5} \]

and the invertible moving average (MA) operator is defined as

\[ \theta_q(B) := 1 - \theta_1 B - \cdots - \theta_q B^q \tag{2.6} \]

where a_t is a white noise Gaussian process, i.e., a sequence of stochastically independent random variables, with marginal distribution

\[ a_t \sim N(0, \sigma_a^2) \tag{2.7} \]

Eq. (2.2) defines the general ARIMA model. When d = 0, Eq. (2.2) defines the ARMA(p, q) process. ARMA(p, q) reduces to an autoregressive process for q = 0, and a moving average process for p = 0. For instance, after applying the backshift operator, the ARIMA(1, 1, 1) process is defined by (Brockwell and Davis, 2016)

\[ z_t := (1 + \varphi_1)\, z_{t-1} - \varphi_1 z_{t-2} + a_t - \theta_1 a_{t-1} \tag{2.8} \]

The ARIMA(p, d, q) process has p + q + 3 parameters, i.e., the parameters of the AR and the MA operators, the value of the d parameter, and the mean μ and standard deviation σ of the process. ARFIMA models are defined by Eq. (2.2) with d taking fractional values in the range (−0.5, 0.5). SARIMA models are obtained by seasonally varying the parameters of ARIMA models. ARIMA variants can also model deterministic trends (see, e.g., Box et al., 2015).
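The definitions above can be explored numerically. The following minimal R sketch (not part of the original chapter) simulates an ARIMA(1, 1, 1) process with illustrative parameter values and re-estimates them; note that the arima and arima.sim functions of base R write the MA part with a plus sign, so the θ1 of Eq. (2.6) enters with its sign reversed.

```r
# Minimal sketch: simulate an ARIMA(1, 1, 1) process and fit it back.
# Parameter values are illustrative; R's arima()/arima.sim() use the MA sign
# convention "+ theta * a_{t-1}", i.e., the opposite of Eq. (2.6).
set.seed(1)
phi1   <- 0.6                                  # AR coefficient of Eq. (2.5)
theta1 <- 0.3                                  # MA coefficient of Eq. (2.6)

sim <- arima.sim(model = list(order = c(1, 1, 1), ar = phi1, ma = -theta1),
                 n = 500)                      # simulated integrated series

fit <- arima(sim, order = c(1, 1, 1))          # maximum likelihood fit
coef(fit)                                      # approximately ar1 = 0.6, ma1 = -0.3
```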

2.3.1.2 Forecasting with ARIMA models

When forecasting using ARIMA models, three steps are generally followed: (i) choosing the correct order of the model, (ii) estimating the model parameters, and (iii) defining a forecasting formula with emphasis on minimizing a metric. Most frequently, the root mean square error (RMSE) is used as the forecasting performance metric (Hyndman and Koehler, 2006). Steps (i) and (ii) have been extensively investigated in the literature; however, there is no exact solution to the problem introduced in the third step. Estimation of the order of the models is possible, e.g., by comparing values of the Akaike information criterion of the fitted models using a maximum likelihood estimation procedure (Akaike, 1974). Other procedures also exist, e.g., via estimation of sample auto-correlations (Bartlett, 1946). In general, automated methods are preferred for this task (Ord and Lowe, 1996; Hyndman and Khandakar, 2008), aiming to provide a large number of forecasts with minimal user intervention. ARIMA models are applied under the assumption of normality (Gaussianity) (Brockwell and Davis, 2016). While this assumption is reasonable for annual time scales, it may not be adequate at monthly time scales. In this case, some transformations should be applied to the data before fitting an ARIMA model (Box and Cox, 1964). It should be noted that it is not the data that are normal (Gaussian), but the model. Furthermore, the appropriate kind of transformation depends on the dataset at hand, and there are no general guidelines available for this purpose. In the case of monthly models, seasonal decomposition should be performed before fitting ARIMA models (Cleveland et al., 1990; Canova and Hansen, 1995; Theodosiou, 2011). In case the data are not decomposed, a SARIMA model should be used instead.
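As an illustration of steps (i) and (ii), the sketch below uses auto.arima from the forecast package (the same package employed later in this chapter) to select the model order by minimizing an information criterion and to apply a Box-Cox transformation. The example series (the built-in Nile annual flows) and the settings ic = "aic" and lambda = "auto" are assumptions made for illustration only; they are not the exact settings of the applications in Section 2.4.

```r
# Minimal sketch (assumed settings): information-criterion-based order selection
# with a Box-Cox transformation, on a built-in annual series used as a stand-in
# for an annual streamflow record.
library(forecast)

y <- Nile                            # annual flows of the Nile river (stats dataset)

fit <- auto.arima(y,
                  ic = "aic",        # step (i): order chosen by minimizing AIC
                  seasonal = FALSE,  # annual data, no seasonal component
                  lambda = "auto")   # Box-Cox transformation (Box and Cox, 1964)

summary(fit)                         # step (ii): maximum likelihood parameter estimates
```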


Once the forecasting model is fitted, an l-step ahead forecast that minimizes the expected squared error can be obtained by following the next four steps (Hyndman and Athanasopoulos, 2018); a minimal illustration in R follows the list:

Step 1: Repeat for l = 1, 2, ...
Step 2: Expand Eq. (2.2) so that z_t is on the left side and the other terms are on the right side.
Step 3: Replace t with n + l.
Step 4: Replace terms on the right side of Eq. (2.2) as follows: future observations with their forecasts, future errors with zeros, and past errors with the corresponding residuals.
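The four steps can be made concrete for a zero-mean ARMA(1, 1) model written in the notation of Eqs. (2.2)–(2.6). The function below is only a hand-rolled sketch of the recursion (future errors set to zero, future observations replaced by their own forecasts); it is not the implementation used by the forecast package, and the simulated series and fitted values serve purely as an example.

```r
# Hand-rolled l-step ahead forecasts for a zero-mean ARMA(1, 1) process,
# z_t = phi1 * z_{t-1} + a_t - theta1 * a_{t-1} (chapter sign convention).
arma11_forecast <- function(z, resid, phi1, theta1, h) {
  n <- length(z)
  f <- numeric(h)
  # l = 1: the future error a_{n+1} is replaced by zero, a_n by the last residual
  f[1] <- phi1 * z[n] - theta1 * resid[n]
  # l >= 2: future observations are replaced by their forecasts; all remaining
  # errors are future errors and are therefore replaced by zero
  if (h >= 2) for (l in 2:h) f[l] <- phi1 * f[l - 1]
  f
}

# Illustrative use with an assumed simulated series:
set.seed(1)
z   <- arima.sim(model = list(ar = 0.6, ma = -0.3), n = 100)
fit <- arima(z, order = c(1, 0, 1), include.mean = FALSE)
arma11_forecast(as.numeric(z), as.numeric(residuals(fit)),
                phi1 = coef(fit)["ar1"], theta1 = -coef(fit)["ma1"], h = 5)
```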

2.3.2 Exponential smoothing models

Among several ETS models, the fundamental one is SES. In one-step ahead forecasting, an SES forecast is obtained through weighted averaging of the last observation and its forecast. Let y_{n+l|n} be the (l-step ahead) forecast of z_{n+l}. The one-step ahead forecast of the SES method is given by (e.g., Hyndman and Athanasopoulos, 2018):

\[ y_{n+1|n} = \alpha z_n + (1 - \alpha)\, y_{n|n-1} \tag{2.9} \]

where y_{n|n-1} is the forecast of z_n at time n − 1. By iterating Eq. (2.9), we obtain

\[ y_{n+1|n} = \sum_{j=0}^{n-1} \alpha (1 - \alpha)^j z_{n-j} + (1 - \alpha)^n b \tag{2.10} \]

Eq. (2.10) includes two parameters, i.e., α and b, which can be estimated by minimizing the mean squared error of the fitted model in the observed time series. More advanced methods (e.g., maximum likelihood estimation) also exist. Besides SES, other ETS models also exist, which can incorporate seasonality, trends, and damped trends (i.e., trends that are dampened as the forecast horizon increases). A taxonomy of SES models proposed by Taylor (2003) is presented in Table 2.7. Variations of the SES model considering trends, seasonality, or both characteristics and all possible combinations are shown in Table 2.7. State-space models underlie ETS models, i.e., each ETS model has two corresponding variants, one with additive and one with multiplicative errors (Hyndman et al., 2008). Automatic model selection can be done using information criteria (e.g., the Akaike information criterion).
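To connect Eqs. (2.9) and (2.10) with software, the sketch below first computes the SES recursion by hand, with an assumed smoothing value α and initial level b, and then fits the same model with the ses function of the forecast package, in which both quantities are estimated from the data.

```r
# Minimal SES sketch: Eq. (2.9) by hand, then the same model via forecast::ses.
library(forecast)

ses_by_hand <- function(z, alpha, b) {
  y <- numeric(length(z) + 1)
  y[1] <- b                                          # initial level b of Eq. (2.10)
  for (t in seq_along(z)) {
    y[t + 1] <- alpha * z[t] + (1 - alpha) * y[t]    # Eq. (2.9)
  }
  y[length(y)]                                       # one-step ahead forecast of z_{n+1}
}

z <- as.numeric(Nile)
ses_by_hand(z, alpha = 0.2, b = z[1])                # assumed alpha and initial level

fit <- ses(Nile, h = 5)                              # alpha and level estimated from data
fit$model$par                                        # estimated parameters
```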

2.4 Large-scale applications at two time scales

In this section, we present two large-scale applications of statistical time series models (ARFIMA and exponential smoothing) for streamflow forecasting in the contiguous US. The first modeling application is


TABLE 2.7 Classification of exponential smoothing methods.

Trend component | Seasonal component: None (N) | Additive (A) | Multiplicative (M)
None (N) | (N, N) | (N, A) | (N, M)
Additive (A) | (A, N) | (A, A) | (A, M)
Additive damped (AD) | (AD, N) | (AD, A) | (AD, M)
Multiplicative (M) | (M, N) | (M, A) | (M, M)
Multiplicative damped (MD) | (MD, N) | (MD, A) | (MD, M)

Adapted from Taylor, J.W., 2003. Exponential smoothing with a damped multiplicative trend. Int. J. Forecast. 19 (4), 715–725. https://doi.org/10.1016/S0169-2070(03)00003-7.

FIGURE 2.1 Locations of the 270 stations used in the present study. The data is sourced from Schaake et al. (2006).

conducted at the annual time scale and the second one at the monthly time scale. In both applications, we applied automatic time series forecasting methods to streamflow discharge data originating from the Model Parameter Estimation Experiment (MOPEX) dataset (Schaake et al., 2006), specifically from 270 MOPEX stations; station locations are shown in Fig. 2.1. For all stations, daily time series data available for the 50-year period (1950–99) were aggregated to form 50-year-long time series of monthly and annual streamflow. Both applications were made in multi-step ahead forecasting mode. The analyses and visualizations were performed in the R programming language (R Core Team, 2019). The following contributed R packages have been used: devtools (Wickham et al., 2019c), dplyr (Wickham et al., 2019b), forecast (Hyndman and Athanasopoulos, 2018; Hyndman et al., 2019),


gdata (Warnes et al., 2017), ggplot2 (Wickham, 2016; Wickham et al., 2019a), hddtools (Vitolo, 2017, 2018), HKprocess (Tyralis, 2016), hydroGOF (Zambrano-Bigiarini, 2017), knitr (Xie, 2014, 2015, 2019), maps (Brownrigg et al., 2018), matrixStats (Bengtsson, 2019), Metrics (Hamner and Frasco, 2018), reshape2 (Wickham, 2007, 2017), rmarkdown (Allaire et al., 2019), and zoo (Zeileis and Grothendieck, 2005; Zeileis et al., 2019). Application of the forecasting methods was done in two subsequent stages. First, the models were fitted according to Table 2.8. Then, the fitted models were used to produce the forecasts, using the forecast function of the forecast R package. This function was implemented with its arguments set to the default values.
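For readers who wish to reproduce the data preparation step described above, the following sketch shows one way to aggregate a daily discharge series to monthly and annual values with the zoo package; the synthetic daily series and the use of sums (rather than means) are illustrative assumptions, since the exact MOPEX preprocessing is not reproduced here.

```r
# Minimal sketch (assumptions noted above): aggregate a daily series to monthly
# and annual series, producing ts objects of the kind used in the two applications.
library(zoo)

set.seed(1)
dates <- seq(as.Date("1950-01-01"), as.Date("1999-12-31"), by = "day")
daily <- zoo(rexp(length(dates), rate = 0.1), order.by = dates)  # synthetic discharge

monthly <- aggregate(daily, by = as.yearmon, FUN = sum)                     # 600 values
annual  <- aggregate(daily, by = function(d) as.integer(format(d, "%Y")),
                     FUN = sum)                                             # 50 values

monthly_ts <- ts(coredata(monthly), start = c(1950, 1), frequency = 12)  # Application 2
annual_ts  <- ts(coredata(annual), start = 1950, frequency = 1)          # Application 1
```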

2.4.1 Application 1: multi-step ahead forecasting of 270 time series of annual streamflow

In this section, we present an application of four statistical time series forecasting methods, i.e., ARFIMA and three error, trend, seasonal (ETS) exponential smoothing variants, namely ETS (no trend), ETS (non-damped trend), and ETS (damped trend) (see Table 2.8 on the different types of the ETS methods that account for trends and seasonality), to 270 time series of annual streamflow, each consisting of 50 values. The auto-correlation function and partial auto-correlation function estimates of the exploited time series are summarized in Fig. 2.2, while estimates of the Hurst parameter (H) of the fGn process are summarized in Fig. 2.3. The H parameter ranges within (0, 1) and can serve as a measure of long-range dependencies (i.e., H = 0.5 corresponds to a white noise Gaussian process). We observe that the magnitude of long-range dependencies in the 270 analyzed time series of annual streamflow is, on average, significant. Implementation of the four statistical models used in this study is described in Table 2.8. None of the ETS methods has a seasonal component, while two of them have an additive trend component. The simplest one (with no trend component) is SES. The methods are fitted to the first 45 values of each streamflow time series and then applied to forecast its last five values, i.e., to produce 5-year ahead annual forecasts. They are benchmarked using the "naïve" method, which sets all forecasted values equal to the last observation of the fitting set, i.e., equal to the 45th value of the time series. Values of the parameters of the fitted models are presented in Figs. 2.4–2.6. Most of the fitted ARFIMA models do not have any AR and MA components, and their d parameter is very close to zero. Very small values were also obtained for the α parameter of the ETS (no trend) model. Forecasting examples are presented in Fig. 2.7. It is important to note that, because all ARFIMA and ETS methods aim at minimizing the forecast RMSE, they produce almost constant-valued multi-step ahead forecasts. Let E_n be the error of the forecast at time n:

\[ E_n := f_n - o_n \tag{2.11} \]


TABLE 2.8 Details on the implementation of the forecasting methods. All R functions were implemented with their arguments set to the default values, unless specified differently.

Application | Method | Fitting R function | Implementation notes
Application 1 (see Section 2.4.1) | ARFIMA | forecast::arfima | (y, estim = "mle", ic = "aic")
Application 1 (see Section 2.4.1) | ETS (no trend) | forecast::ets | (y, model = "ANN", damped = FALSE, additive.only = TRUE, opt.crit = "mse", ic = "aic", restrict = FALSE, allow.multiplicative.trend = FALSE)
Application 1 (see Section 2.4.1) | ETS (non-damped trend) | forecast::ets | (y, model = "AAN", damped = FALSE, additive.only = TRUE, opt.crit = "mse", ic = "aic", restrict = FALSE, allow.multiplicative.trend = FALSE)
Application 1 (see Section 2.4.1) | ETS (damped trend) | forecast::ets | (y, model = "AAN", damped = TRUE, additive.only = TRUE, opt.crit = "mse", ic = "aic", restrict = FALSE, allow.multiplicative.trend = FALSE)
Application 2 (see Section 2.4.2) | Seasonal ETS (no trend) | forecast::ets | (y, model = "ANA", damped = FALSE, additive.only = TRUE, opt.crit = "mse", ic = "aic", restrict = FALSE, allow.multiplicative.trend = FALSE)
Application 2 (see Section 2.4.2) | Seasonal ETS (non-damped trend) | forecast::ets | (y, model = "AAA", damped = FALSE, additive.only = TRUE, opt.crit = "mse", ic = "aic", restrict = FALSE, allow.multiplicative.trend = FALSE)
Application 2 (see Section 2.4.2) | Seasonal ETS (damped trend) | forecast::ets | (y, model = "AAA", damped = TRUE, additive.only = TRUE, opt.crit = "mse", ic = "aic", restrict = FALSE, allow.multiplicative.trend = FALSE)

Note: The argument y was a numeric vector containing the fitting set for the application in Section 2.4.1 and an object of class ts with frequency equal to 12 for the application in Section 2.4.2. The fitting sets are defined in Section 2.4. ETS (type of trend) refers to the corresponding version of the exponential smoothing method in Table 2.7.
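A runnable version of two rows of Table 2.8 may look like the sketch below, which fits the ETS (no trend) model to an annual series and the seasonal ETS (no trend) model to a monthly ts object of frequency 12, and then calls the forecast function. The data (built-in R series) and forecast horizons are placeholders, not the MOPEX series and settings of the chapter.

```r
# Minimal sketch: two rows of Table 2.8 applied to placeholder data.
library(forecast)

# Application 1 style: annual series, ETS (no trend), 5-step ahead forecasts
annual_fit <- ets(Nile, model = "ANN", damped = FALSE, additive.only = TRUE,
                  opt.crit = "mse", ic = "aic", restrict = FALSE,
                  allow.multiplicative.trend = FALSE)
forecast(annual_fit, h = 5)

# Application 2 style: monthly ts (frequency 12), seasonal ETS (no trend),
# 12-step ahead forecasts
monthly_fit <- ets(log(AirPassengers), model = "ANA", damped = FALSE,
                   additive.only = TRUE, opt.crit = "mse", ic = "aic",
                   restrict = FALSE, allow.multiplicative.trend = FALSE)
forecast(monthly_fit, h = 12)
```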



FIGURE 2.2 Boxplots of the (A) auto-correlation function and (B) partial auto-correlation function estimates of the 270 time series of annual streamflow used in the present study. The data is sourced from Schaake et al. (2006).

FIGURE 2.3 Histogram of the H parameter estimates of fractional Gaussian noise for the 270 time series of annual streamflow used in the present study. Their median is denoted by a dashed line. The data is sourced from Schaake et al. (2006).

FIGURE 2.4 Orders p (AR component) and q (MA component) of the fitted ARFIMA models in Application 1.

FIGURE 2.5 Histogram of the orders d of differencing of the fitted ARFIMA models in Application 1. The median of d in the histogram is denoted by a dashed line.

FIGURE 2.6 Histogram of the values of the parameter α of the fitted exponential smoothing (no trend) models in Application 1. The median of α in the histogram is denoted by a dashed line.


FIGURE 2.7 Forecasting examples for Application 1: Forecasts for six streamflow time series of annual discharge totals.


The performance of the models was assessed by computing the RMSE, mean absolute error (MAE), and median absolute error (MdAE) of their forecasts, as defined below.

\[ \mathrm{RMSE} := \Big( (1/|N|) \sum_{n} E_n^2 \Big)^{1/2} \tag{2.12} \]

\[ \mathrm{MAE} := (1/|N|) \sum_{n} |E_n| \tag{2.13} \]

\[ \mathrm{MdAE} := \operatorname{median}_n \{ |E_n| \} \tag{2.14} \]

In the above equations, n ∈ [1, N] denotes the serial number of a forecasted value, f_n denotes the nth forecasted value, and o_n denotes the nth observed value in the forecast period. The computed metric values are summarized in Fig. 2.8. All ARFIMA and ETS models mostly produced forecasts with smaller RMSE, MAE, and MdAE values than the naïve benchmarks. The difference in the values of the performance evaluation criteria for the four applied models is small, with the ARFIMA model demonstrating the best performance in streamflow forecasts. Fig. 2.8 is informative to a limited extent. However, each forecasting case (defined by the forecasted time series) should be examined separately. Therefore, the rankings of the models according to their performance with respect to RMSE, MAE, and MdAE are presented in Figs. 2.9–2.11, respectively. It is observed that (i) the benchmark is mostly ranked fifth with respect to all metrics, (ii) ARFIMA, ETS (no trend), and ETS (damped trend)


FIGURE 2.8 Boxplots of the (A) root mean square error, (B) mean absolute error, and (C) median absolute error values in Application 1.


FIGURE 2.9 Maps of rankings of (a) naïve, (b) ARFIMA, (c) exponential smoothing (ETS) (no trend), (d) ETS (non-damped trend), and (e) ETS (damped trend) models, with respect to the root mean square error metric in Application 1. The models are ranked from best (first) to worst (fifth).

are competitive with each other, and (iii) ETS (non-damped trend) is often ranked fourth or fifth, i.e., it performs worse than ARFIMA and the remaining ETS models. The information presented in Figs. 2.9–2.11 is effectively summarized in Figs. 2.12 and 2.13. According to the latter figure, ARFIMA has the smallest (i.e., the best) mean ranking in terms of RMSE and MdAE and a practically equal (slightly higher) mean ranking to that of ETS (no trend) in terms of MAE. The ETS (no trend) and ETS (damped trend) models have significantly smaller mean rankings than the ETS (non-damped trend) model.
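The per-series rankings summarized in Figs. 2.12 and 2.13 can be reproduced with a few lines of R. The sketch below assumes a matrix of error values, one row per station and one column per method, which is a hypothetical stand-in for the values computed in this application.

```r
# Sketch: rank the methods per time series by a chosen metric and summarize the
# rankings across series (cf. Figs. 2.12 and 2.13). The error matrix is made up.
set.seed(1)
errors <- matrix(runif(270 * 5), nrow = 270,
                 dimnames = list(NULL, c("naive", "ARFIMA", "ETS_no_trend",
                                         "ETS_trend", "ETS_damped")))

rankings <- t(apply(errors, 1, rank))   # rank 1 = smallest error per station
colMeans(rankings)                      # mean ranking of each method
table(rankings[, "ARFIMA"])             # how often ARFIMA ranks 1st, 2nd, ...
```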


FIGURE 2.10 Same as Fig. 2.9, but for the mean absolute error metric in Application 1.

Relative improvements of ARFIMA and ETS models were also computed with respect to the naïve benchmark. This information is summarized in Fig. 2.14. It is observed that the computed relative improvements are mostly positive, taking values up to about 100%. However, very large (in absolute terms) negative relative improvements are also observed, in particular in terms of MdAE. Finally, the mean relative improvements in terms of RMSE, MAE, and MdAE of ARFIMA and ETS models are presented with respect to the naïve benchmark (Fig. 2.15). All the mean relative improvements are positive,


FIGURE 2.11 Same as Fig. 2.9, but for the median absolute error metric in Application 1.

except for the relative improvement of ETS (non-damped trend) in terms of MdAE. The largest mean relative improvements are introduced by ARFIMA. The second-best model is ETS (no trend) and the third-best is ETS (damped trend). The computed mean relative improvements ranged from 12% to 21% in terms of RMSE and MAE and from 2.5% to 13% in terms of MdAE.
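The error metrics of Eqs. (2.11)–(2.14) and the relative improvements reported above can be computed directly; the sketch below is one possible implementation, in which the relative improvement is taken as (benchmark error − model error) / benchmark error, an assumed formula that is not spelled out explicitly in the text.

```r
# Sketch: error metrics of Eqs. (2.12)-(2.14) and relative improvement of a model
# over the naive benchmark (definition assumed, see lead-in above).
rmse <- function(f, o) sqrt(mean((f - o)^2))    # Eq. (2.12)
mae  <- function(f, o) mean(abs(f - o))         # Eq. (2.13)
mdae <- function(f, o) median(abs(f - o))       # Eq. (2.14)

rel_improvement <- function(metric, f_model, f_bench, o) {
  100 * (metric(f_bench, o) - metric(f_model, o)) / metric(f_bench, o)
}

# Illustrative use with made-up observations and forecasts:
o       <- c(10, 12, 11, 13, 12)                # observed values in the forecast period
f_naive <- rep(11, 5)                           # naive method: last observation repeated
f_model <- c(10.5, 11.8, 11.2, 12.6, 12.1)      # some model's forecasts

rel_improvement(rmse, f_model, f_naive, o)      # positive when the model beats the benchmark
```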

2.4.2 Application 2: multi-step ahead forecasting of 270 time series of monthly streamflow

In this section, we present an application of three statistical time series forecasting methods to 270 streamflow time series of monthly discharge totals.


FIGURE 2.12 Number of times that each method was ranked from best (first) to worst (fifth) in Application 1 in terms of (a) root mean square error, (b) mean absolute error, and (c) median absolute error metrics.


FIGURE 2.13 Mean rankings of the methods in Application 1 in terms of root mean square error (RMSE), mean absolute error (MAE), and median absolute error (MdAE) metrics.

FIGURE 2.14 Boxplots of the relative improvements of ARFIMA and exponential smoothing methods with respect to the naïve benchmark in Application 1.


FIGURE 2.15 Mean relative improvements of ARFIMA and exponential smoothing methods with respect to the naïve benchmark in Application 1.

FIGURE 2.16 Histogram of the α parameter values of the fitted seasonal exponential smoothing models (no trend) in Application 2. The median of the histogram is denoted by a dashed line.

Each of these time series contains 600 values. The assessed methods are named seasonal ETS (no trend), seasonal ETS (non-damped trend), and seasonal ETS (damped trend). As indicated by their names, all methods have a seasonal component. Moreover, two of them also have an additive trend component. The models are fitted to the first 588 values of each time series (the α parameter values are summarized in Fig. 2.16) and then applied to forecast the last 12 values, i.e., to produce 1-year ahead monthly forecasts. They are benchmarked using the "seasonal naïve" method. This method is based on the monthly values of the last year of the fitting dataset. For instance,


the forecast for the month of August is equal to the value observed in the last August of the fitting dataset. In this case study, this observed value is the 584th value of the streamflow time series. Several forecasting examples are presented in Fig. 2.17. The seasonal ETS models often produce total monthly streamflow forecasts that are almost identical. This was observed very rarely for the nonseasonal ETS models in

FIGURE 2.17 Forecasting examples for Application 2: Forecasts for six streamflow time series of monthly discharge totals. The horizontal axis has been truncated in year 1990.


Application 1 (see previous section), where annual streamflow data were used (Fig. 2.7). This finding indicates that the seasonal ETS models rarely find trends in the fitting datasets. The performance assessment is made by computing the RMSE, MAE, and MdAE of the forecasts. Boxplots were drawn to summarize the information obtained from the computed metrics (Fig. 2.18). It is observed that the seasonal ETS models perform significantly better than the seasonal naïve approach with respect to all three metrics and are mostly comparable to each other. It is also observed that seasonal ETS (damped trend) exhibits slightly worse performance than the other two seasonal ETS models. The rankings of the models in terms of RMSE, MAE, and MdAE are presented in Figs. 2.19–2.21, respectively, for the analyzed stations. It is observed that the seasonal naïve approach is mostly ranked fourth (with few exceptions) in terms of RMSE and MAE, whereas it ranks first for approximately half of the considered stations in terms of MdAE. The second-worst model seems to be seasonal ETS (damped trend), whereas the other two seasonal ETS models are mostly comparable to each other in terms of rankings. The information presented in Figs. 2.19–2.21 is effectively summarized in Figs. 2.22 and 2.23. The relative improvements of the seasonal ETS models are also computed with respect to the seasonal naïve benchmark. A summary of this information is given in Fig. 2.24. It is observed that the computed relative improvements are mostly positive. However, negative relative improvements are also observed, mostly as outliers.
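The seasonal naïve benchmark used throughout this application can be sketched in a few lines: each month of the forecast year is set equal to the value observed in the same month of the last fitting year, which is also what the snaive function of the forecast package returns. The synthetic monthly series below is an illustrative stand-in for the 588-value fitting sets.

```r
# Sketch: seasonal naive forecasts for a monthly ts, by hand and via forecast::snaive.
library(forecast)

y <- ts(rnorm(588, mean = 100, sd = 20), start = c(1950, 1), frequency = 12)

by_hand <- tail(as.numeric(y), 12)               # repeat the last observed year

snaive_fc <- snaive(y, h = 12)                   # seasonal naive via the forecast package
all.equal(by_hand, as.numeric(snaive_fc$mean))   # TRUE: identical point forecasts
```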


FIGURE 2.18 Boxplots of the (a) root mean square error, (b) mean absolute error, and (c) median absolute error values computed in Application 2.

FIGURE 2.19 Maps of rankings of (a) seasonal naïve, (b) seasonal exponential smoothing (ETS) (no trend), (c) seasonal ETS (non-damped trend), and (d) seasonal ETS (damped trend) methods in terms of the root mean square error (RMSE) metric in Application 2. The methods are ranked from best (first) to worst (fourth).

FIGURE 2.20 Same as Fig. 2.19, but for the mean absolute error metric in Application 2.


FIGURE 2.21 Same as Fig. 2.19, but for the median absolute error metric in Application 2.

Lastly, the mean relative improvements in terms of RMSE, MAE, and MdAE of the seasonal ETS models are presented with respect to the seasonal naïve benchmark (Fig. 2.25). All the mean relative improvements in terms of RMSE and MAE are positive, whereas all the mean relative improvements in terms of MdAE are negative. The largest mean relative improvements are introduced by the seasonal ETS (non-damped trend) model; nevertheless, the performance differences between the models presented in Fig. 2.25 are very small.

2.5 Conclusions

Since their introduction almost 100 years ago, statistical schemes have been the principal methods for univariate time series forecasting. Their recent development, both theoretical and practical (the latter through large-scale studies), has been enormous. In hydrology, statistical methods developed 40 years ago have been extensively implemented in large-scale experiments; however, recent developments have remained mostly unexploited during a period when an explosively increasing number of case studies claim the superiority of artificial intelligence models in time series forecasting of geophysical processes at large time scales and hydrological processes at small time scales. However, recent large-scale studies based on the development of


FIGURE 2.22 Number of times that each method was ranked from best (first) to worst (fourth) in Application 2 in terms of (a) root mean square error, (b) mean absolute error, and (c) median absolute error metrics.


FIGURE 2.23 Mean rankings of the methods in Application 2 in terms of root mean square error (RMSE), mean absolute error (MAE), and median absolute error (MdAE) metrics.

FIGURE 2.24 Boxplots of the relative improvements of the seasonal exponential smoothing models with respect to the seasonal naïve benchmark in Application 2.


FIGURE 2.25 Mean relative improvements of the seasonal exponential smoothing models with respect to the seasonal naïve benchmark in Application 2.

automated statistical methods have shown that forecasting using statistical methods can be competitive. This chapter aimed to reestablish statistical methods for hydrological time series forecasting, through presentation of their history and two large-scale applications.

Conflicts of interest

We declare no conflict of interest.

Acknowledgment

We thank the Editors of the book and the reviewers whose comments helped us to improve the manuscript.

References

Akaike, H., 1974. A new look at the statistical model identification. IEEE Trans. Automat. Contr. 19 (6), 716–723. https://doi.org/10.1109/TAC.1974.1100705.
Allaire, J.J., Xie, Y., McPherson, J., Luraschi, J., Ushey, K., Atkins, A., Wickham, H., Cheng, J., Chang, W., Iannone, R., 2019. Rmarkdown: Dynamic Documents for R. R Package Version 1.15. https://CRAN.R-project.org/package=rmarkdown.
Armstrong, J.S., 2001. Principles of Forecasting. Springer US. https://doi.org/10.1007/978-0-306-47630-3.


Bartlett, M.S., 1946. On the theoretical specification and sampling properties of autocorrelated time-series. J. Roy. Stat. Soc. Suppl. 8 (1), 27e41. https://doi.org/10.2307/2983611. Beaumont, C., 1979. Stochastic models in hydrology. Prog. Phys. Geogr. Earth Environ. 3 (3), 363e391. https://doi.org/10.1177/030913337900300303. Bengtsson, H., 2019. matrixStats: Functions that Apply to Rows and Columns of Matrices (and to Vectors). R Package Version 0.55.0. https://CRAN.R-project.org/package¼matrixStats. Beran, J., Feng, Y., Ghosh, S., Kulik, R., 2013. Long-Memory Processes. Springer-Verlag Berlin Heidelberg. https://doi.org/10.1007/978-3-642-35512-7. Blo¨schl, G., Bierkens, M.F.P., Chambel, A., et al., 2019. Twenty-three unsolved problems in hydrology (UPH) e a community perspective. Hydrol. Sci. J. 64 (10), 1141e1158. https:// doi.org/10.1080/02626667.2019.1620507. Brownrigg, R., Minka, T.P., Deckmyn, A., 2018. Maps: Draw Geographical Maps. R Package Version 3.3.0. https://CRAN.R-project.org/package¼maps. Box, G.E.P., Cox, D.R., 1964. An analysis of transformations. J. Roy. Stat. Soc. B 26 (2), 211e243. https://doi.org/10.1111/j.2517-6161.1964.tb00553.x. Box, G.E.P., Jenkins, G.M., 1970. Time Series Analysis: Forecasting and Control. Holden-Day Inc., San Francisco, USA. Box, G.E.P., Jenkins, G.M., Reinsel, G.C., Ljung, G.M., 2015. Time Series Analysis: Forecasting and Control. John Wiley & Sons, Inc., Hoboken, New Jersey. ISBN 978-1-118-67502-1. Brockwell, P.J., Davis, R.A., 1991. Time Series: Theory and Methods. Springer-Verlag New York. https://doi.org/10.1007/978-1-4419-0320-4. Brockwell, P.J., Davis, R.A., 2016. Introduction to Time Series and Forecasting. Springer International Publishing. https://doi.org/10.1007/978-3-319-29854-2. Brown, R.G., 1959. Statistical Forecasting for Inventory Control. McGraw-Hill Book Co., New York, USA. Canova, F., Hansen, B.E., 1995. Are seasonal patterns constant over time? A test for seasonal stability. J. Bus. Econ. Stat. 13 (3), 237e252. https://doi.org/10.1080/07350015.1995.105 24598. Carlson, R.F., MacCormick, A.J.A., Watts, D.G., 1970. Application of linear random models to four annual streamflow series. Water Resour. Res. 6 (4), 1070e1078. https://doi.org/10.1029/ WR006i004p01070. Chatfield, C., 1988. The future of the time-series forecasting. Int. J. Forecast. 4 (3), 411e419. https://doi.org/10.1016/0169-2070(88)90108-2. Cleveland, R.B., Cleveland, W.S., McRae, J.E., Terpenning, I., 1990. STL: a seasonal-trend decomposition procedure based on loess. J. Off. Stat. 6 (1), 3e33. Cox, D.R., 1991. Long-range dependence, non-linearity and time irreversibility. J. Time Anal. 12 (4), 329e335. https://doi.org/10.1111/j.1467-9892.1991.tb00087.x. Dagum, E.B., Bianconcini, S., 2016. Seasonal Adjustment Methods and Real Time Trend-Cycle Estimation. Springer International Publishing. https://doi.org/10.1007/978-3-319-31822-6. De Gooijer, J.G., 2017. Elements of Nonlinear Time Series Analysis and Forecasting. Springer International Publishing. https://doi.org/10.1007/978-3-319-43252-6. De Gooijer, J.G., Hyndman, R.J., 2006. 25 years of time series forecasting. Int. J. Forecast. 22 (3), 443e473. https://doi.org/10.1016/j.ijforecast.2006.01.001. De Gooijer, J.G., Abraham, B., Gould, A., Robinson, L., 1985. Methods for determining the order of an autoregressive-moving average process: a survey. Int. Stat. Rev. 53 (3), 301e329. https:// doi.org/10.2307/1402894. De Livera, A.M., Hyndman, R.J., Snyder, R.D., 2011. 
Forecasting time series with complex seasonal patterns using exponential smoothing. J. Am. Stat. Assoc. 106 (496), 1513e1527. https://doi.org/10.1198/jasa.2011.tm09771.

82 Advances in Streamflow Forecasting Dyer, T.G.J., 1977. On the application of some stochastic models to precipitation forecasting. Q. J. R. Meteorol. Soc. 103 (435), 177e189. https://doi.org/10.1002/qj.49710343512. Franses, P.H., Ooms, M., 1997. A periodic long-memory model for quarterly UK inflation. Int. J. Forecast. 13 (1), 117e126. https://doi.org/10.1016/S0169-2070(96)00715-7. Gardner Jr., E.S., 1985. Exponential smoothing: the state of the art. J. Forecast. 4 (1), 1e28. https:// doi.org/10.1002/for.3980040103. Gardner Jr., E.S., 2006. Exponential smoothing: the state of the artdPart II. Int. J. Forecast. 22 (4), 637e666. https://doi.org/10.1016/j.ijforecast.2006.03.005. Gardner Jr., E.S., McKenzie, E.D., 1985. Forecasting trends in time series. Manag. Sci. 31 (10), 1237e1246. https://doi.org/10.1287/mnsc.31.10.1237. Granger, C.W.J., 1978. New classes of time series models. J. Roy. Stat. Soc. 27 (3e4), 237e253. https://doi.org/10.2307/2988186. Granger, C.W.J., Joyeux, R., 1980. An introduction to long-memory time series models and fractional differencing. J. Time Anal. 1 (1), 15e29. https://doi.org/10.1111/j.1467-9892. 1980.tb00297.x. Graves, T., Gramacy, R., Watkins, N., Franzke, C., 2017. A brief history of long memory: Hurst, Mandelbrot and the road to ARFIMA, 1951e1980. Entropy 19 (9), 437. https://doi.org/ 10.3390/e19090437. Hamner, B., Frasco, M., 2018. Metrics: Evaluation Metrics for Machine Learning. R Package Version 0.1.4. https://CRAN.R-project.org/package¼Metrics. Hipel, K.W., McLeod, A.I., 1994. Time Series Modelling of Water Resources and Environmental Systems. Elsevier. ISBN 978-0-444-89270-6. Hipel, K.W., McLeod, A.I., Lennox, W.C., 1977a. Advances in box-jenkins modeling: 1. Model construction. Water Resour. Res. 13 (3), 567e575. https://doi.org/10.1029/WR013i003 p00567. Hipel, K.W., McLeod, A.I., Lennox, W.C., 1977b. Advances in box-jenkins modeling: 2. Applications. Water Resour. Res. 13 (3), 577e586. https://doi.org/10.1029/WR013i003p00577. Hosking, J.M.R., 1981. Fractional differencing. Biometrika 68 (1), 165e176. https://doi.org/ 10.1093/biomet/68.1.165. Holt, C.C., 2004. Forecasting seasonals and trends by exponentially weighted moving averages. Int. J. Forecast. 20 (1), 5e10. https://doi.org/10.1016/j.ijforecast.2003.09.015. Hong, W.C., Pai, P.F., 2007. Potential assessment of the support vector regression technique in rainfall forecasting. Water Resour. Manag. 21 (2), 495e513. https://doi.org/10.1007/s11269006-9026-2. Hurst, H.E., 1951. Long term storage capacity of reservoirs. Trans. Am. Soc. Civ. Eng. 116, 770e808. Hurst, H.E., 1956. The problem of long-term storage in reservoirs. International Association of Scientific Hydrology. Bulletin 1 (3), 13e27. https://doi.org/10.1080/02626665609493644. Hyndman, R.J., 2020. A brief history of forecasting competitions. Int. J. Forecast. 36 (1), 7e14. https://doi.org/10.1016/j.ijforecast.2019.03.015. Hyndman, R.J., Athanasopoulos, G., 2018. Forecasting: Principles and Practice. https://otexts.com/ fpp2/. Hyndman, R.J., Athanasopoulos, G., Bergmeir, C., Caceres, G., Chhay, L., O’Hara-Wild, M., Petropoulos, F., Razbash, S., Wang, E., Yasmeen, F., 2019. Forecast: Forecasting Functions for Time Series and Linear Models. R Package Version 8.9. https://CRAN.R-project.org/ package¼forecast. Hyndman, R.J., Khandakar, Y., 2008. Automatic time series forecasting: the forecast package for R. J. Stat. Softw. 27 (3) https://doi.org/10.18637/jss.v027.i03.


Hyndman, R.J., Koehler, A.B., 2006. Another look at measures of forecast accuracy. Int. J. Forecast. 22 (4), 679e688. https://doi.org/10.1016/j.ijforecast.2006.03.001. Hyndman, R.J., Koehler, A.B., Snyder, R.D., Grose, S., 2002. A state space framework for automatic forecasting using exponential smoothing methods. Int. J. Forecast. 18 (3), 439e454. https://doi.org/10.1016/S0169-2070(01)00110-8. Hyndman, R.J., Koehler, A.B., Ord, J.K., Snyder, R.D., 2008. Forecasting with Exponential Smoothing. Springer-Verlag Berlin Heidelberg. https://doi.org/10.1007/978-3-540-71918-2. Kazmann, R.G., 1964. New problems in hydrology. J. Hydrol. 2 (2), 92e100. https://doi.org/ 10.1016/0022-1694(64)90020-4. Klemes, V., 1974. The Hurst phenomenon: a puzzle? Water Resour. Res. 10 (4), 675e688. https:// doi.org/10.1029/WR010i004p00675. Kolmogorov, A.N., 1940. Wienersche Spiralen und einige andere interessante Kurven im Hilbertschen Raum. Comptes Rendus (Doklady) Acad. Sci. USSR (N.S.) 26, 115e118. Ledolter, J., 1978. A general class of stochastic models for hydrologic sequences. J. Hydrol. 36 (3e4), 309e325. https://doi.org/10.1016/0022-1694(78)90151-8. Liu, L.M., 1989. Identification of seasonal arima models using a filtering method. Commun. Stat. Theor. Methods 18 (6), 2279e2288. https://doi.org/10.1080/03610928908830035. Machiwal, D., Jha, M.K., 2012. Hydrologic Time Series Analysis: Theory and Practice. Capital Publishing Company, New Delhi, India, p. 303. Springer, The Netherlands. https://doi.org/10. 1007/978-94-007-1861-6. Mandelbrot, B.B., Van Ness, J.W., 1968. Fractional Brownian motions, fractional noises and applications. SIAM Rev. 10 (4), 422e437. https://doi.org/10.1137/1010093. Mandelbrot, B.B., Wallis, J.R., 1968. Noah, Joseph, and operational hydrology. Water Resour. Res. 4 (5), 909e918. https://doi.org/10.1029/WR004i005p00909. Matalas, N.C., 1967. Time series analysis. Water Resour. Res. 3 (3), 817e829. https://doi.org/ 10.1029/WR003i003p00817. McLeod, A.I., Hipel, K.W., 1978a. Preservation of the rescaled adjusted range: 1. A reassessment of the Hurst phenomenon. Water Resour. Res. 14 (3), 491e508. https://doi.org/10.1029/WR0 14i003p00491. McLeod, A.I., Hipel, K.W., 1978b. Preservation of the rescaled adjusted range: 2. Simulation studies using Box-Jenkins Models. Water Resour. Res. 14 (3), 509e516. https://doi.org/ 10.1029/WR014i003p00509. McLeod, A.I., Hipel, K.W., 1978c. Preservation of the rescaled adjusted range: 3. Fractional Gaussian noise algorithms. Water Resour. Res. 14 (3), 517e518. https://doi.org/10.1029/WR0 14i003p00517. Murat, M., Malinowska, I., Hoffmann, H., Baranowski, P., 2016. Statistical modelling of agrometeorological time series by exponential smoothing. Int. Agrophys. 30 (1), 57e65. https:// doi.org/10.1515/intag-2015-0076. Noakes, D.J., McLeod, A.I., Hipel, K.W., 1985. Forecasting monthly riverflow time series. Int. J. Forecast. 1 (2), 179e190. https://doi.org/10.1016/0169-2070(85)90022-6. Noakes, D.J., Hipel, K.W., McLeod, A.I., Jimenez, C., Yakowitz, S., 1988. Forecasting annual geophysical time series. Int. J. Forecast. 4 (1), 103e115. https://doi.org/10.1016/01692070(88)90012-X. Ord, K., Lowe, S., 1996. Automatic forecasting. Am. Statistician 50 (1), 88e94. https://doi.org/ 10.1080/00031305.1996.10473549. Ord, J.K., Koehler, A.B., Snyder, R.D., 1997. Estimation and prediction for a class of dynamic nonlinear statistical models. J. Am. Stat. Assoc. 92 (440), 1621e1629. https://doi.org/10.1080/ 01621459.1997.10473684.

84 Advances in Streamflow Forecasting Papacharalampous, G., Tyralis, H., 2018. Evaluation of random forests and Prophet for daily streamflow forecasting. Adv. Geosci. 45, 201e208. https://doi.org/10.5194/adgeo-45-2012018. Papacharalampous, G., Tyralis, H., Koutsoyiannis, D., 2018a. One-step ahead forecasting of geophysical processes within a purely statistical framework. Geosci. Lett. 5 (12) https:// doi.org/10.1186/s40562-018-0111-1. Papacharalampous, G., Tyralis, H., Koutsoyiannis, D., 2018b. Predictability of monthly temperature and precipitation using automatic time series forecasting methods. Acta Geophysica 66 (4), 807e831. https://doi.org/10.1007/s11600-018-0120-7. Papacharalampous, G., Tyralis, H., Koutsoyiannis, D., 2018c. Univariate time series forecasting of temperature and precipitation with a focus on machine learning algorithms: a multiple-case study from Greece. Water Resour. Manag. 32 (15), 5207e5239. https://doi.org/10.1007/ s11269-018-2155-6. Papacharalampous, G., Tyralis, H., Koutsoyiannis, D., 2019a. Comparison of stochastic and machine learning methods for multi-step ahead forecasting of hydrological processes. Stoch. Environ. Res. Risk Assess. 33 (2), 481e514. https://doi.org/10.1007/s00477-018-1638-6. Papacharalampous, G., Tyralis, H., Langousis, A., Jayawardena, A.W., Sivakumar, B., Mamassis, N., Montanari, A., Koutsoyiannis, D., 2019b. Probabilistic hydrological postprocessing at scale: why and how to apply machine-learning quantile regression algorithms. Water 11 (10), 2126. https://doi.org/10.3390/w11102126. Papacharalampous, G., Koutsoyiannis, D., Montanari, A., 2020a. Quantification of predictive uncertainty in hydrological modelling by harnessing the wisdom of the crowd: methodology development and investigation using toy models. Adv. Water Resour. 136, 103471. https:// doi.org/10.1016/j.advwatres.2019.103471. Papacharalampous, G., Tyralis, H., Koutsoyiannis, D., Montanari, A., 2020b. Quantification of predictive uncertainty in hydrological modelling by harnessing the wisdom of the crowd: a large-sample experiment at monthly time scale. Adv. Water Resour. 136, 103470. https:// doi.org/10.1016/j.advwatres.2019.103470. Pegels, C.C., 1969. Exponential forecasting: some new variations. Manag. Sci. 15 (5), 311e315. https://doi.org/10.1287/mnsc.15.5.311. Peiris, M.S., Perera, B.J.C., 1988. On prediction with fractionally differenced ARIMA models. J. Time Anal. 9 (3), 215e220. https://doi.org/10.1111/j.1467-9892.1988.tb00465.x. R Core Team, 2019. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/. Schaake, J., Cong, S., Duan, Q., 2006. US MOPEX Data Set. IAHS Publication 307, pp. 9e28. Scheidegger, A.E., 1970. Stochastic models in hydrology. Water Resour. Res. 6 (3), 750e755. https://doi.org/10.1029/WR006i003p00750. Taylor, J.W., 2003. Exponential smoothing with a damped multiplicative trend. Int. J. Forecast. 19 (4), 715e725. https://doi.org/10.1016/S0169-2070(03)00003-7. Theodosiou, M., 2011. Forecasting monthly and quarterly time series using STL decomposition. Int. J. Forecast. 27 (4), 1178e1195. https://doi.org/10.1016/j.ijforecast.2010.11.002. Tyralis, H., 2016. HKprocess: Hurst-Kolmogorov Process. R Package Version 0.0-2. In: https:// CRAN.R-project.org/package¼HKprocess. Tyralis, H., Koutsoyiannis, D., 2011. Simultaneous estimation of the parameters of the HurstKolmogorov stochastic process. Stoch. Environ. Res. Risk Assess. 25 (1), 21e33. 
https:// doi.org/10.1007/s00477-010-0408-x.

Streamflow forecasting at large time scales using statistical models Chapter | 2

85

Tyralis, H., Koutsoyiannis, D., 2014. A Bayesian statistical model for deriving the predictive distribution of hydroclimatic variables. Clim. Dynam. 42 (11e12), 2867e2883. https:// doi.org/10.1007/s00382-013-1804-y. Tyralis, H., Papacharalampous, G., 2017. Variable selection in time series forecasting using random forests. Algorithms 10 (4), 114. https://doi.org/10.3390/a10040114. Tyralis, H., Papacharalampous, G., 2018. Large-scale assessment of Prophet for multi-step ahead forecasting of monthly streamflow. Adv. Geosci. 45, 147e153. https://doi.org/10.5194/adgeo45-147-2018. Tyralis, H., Papacharalampous, G., Burnetas, A., Langousis, A., 2019a. Hydrological postprocessing using stacked generalization of quantile regression algorithms: large-scale application over CONUS. J. Hydrol. 577, 123957. https://doi.org/10.1016/j.jhydrol.2019.123957. Tyralis, H., Papacharalampous, G., Langousis, A., 2019b. A brief review of random forests for water scientists and practitioners and their recent history in water resources. Water 11 (5), 910. https://doi.org/10.3390/w11050910. Vafakhah, M., Majdar, H.A., Eslamian, S., 2017. Rainfall prediction using time series analysis. In: Eslamian, S., Eslamian, F.A. (Eds.), Handbook of Drought and Water Scarcity, vol. 1, pp. 517e539. Veneziano, D., Langousis, A., 2010. Scaling and fractals in hydrology. In: Sivakumar, B., Berndtsson, R. (Eds.), Advances in Data-Based Approaches for Hydrologic Modeling and Forecasting. World Scientific, Singapore, pp. 107e243. https://doi.org/10.1142/9789814 307987_0004. Vitolo, C., 2017. hddtools: hydrological data discovery tools. J. Open Source Softw. 2 (9) https:// doi.org/10.21105/joss.00056. Vitolo, C., 2018. hddtools: Hydrological Data Discovery Tools. R Package Version 0.8.2. https:// CRAN.R-project.org/package¼hddtools. Warnes, G.R., Bolker, B., Gorjanc, G., Grothendieck, G., Korosec, A., Lumley, T., MacQueen, D., Magnusson, A., Rogers, J., 2017. Gdata: Various R Programming Tools for Data Manipulation. R Package Version 2.18.0. https://CRAN.R-project.org/package¼gdata. Wickham, H., 2007. Reshaping data with the reshape package. J. Stat. Softw. 21 (12), 1e20. https://doi.org/10.18637/jss.v021.i12. Wickham, H., 2016. ggplot2. Springer International Publishing. https://doi.org/10.1007/978-3-31924277-4. Wickham, H., 2017. reshape2: Flexibly Reshape Data: A Reboot of the Reshape Package. R Package Version 1.4.3. https://CRAN.R-project.org/package¼reshape2. Wickham, H., Chang, W., Henry, L., Pedersen, T.L., Takahashi, K., Wilke, C., Woo, K., Yutani, H., 2019a. ggplot2: Create Elegant Data Visualisations Using the Grammar of Graphics. R Package Version 3.2.1. https://CRAN.R-project.org/package¼ggplot2. Wickham, H., Franc¸ois, R., Henry, L., Mu¨ller, K., 2019b. dplyr: A Grammar of Data Manipulation. R Package Version 0.8.3. https://CRAN.R-project.org/package¼dplyr. Wickham, H., Hester, J., Chang, W., 2019c. devtools: Tools to Make Developing R Packages Easier. R Package Version 2.2.1. https://CRAN.R-project.org/package¼devtools. Winters, P.R., 1960. Forecasting sales by exponentially weighted moving averages. Manag. Forecast. 6 (3), 324e342. https://doi.org/10.1287/mnsc.6.3.324. Wold, H.O., 1938. A Study in the Analysis of Stationary Time Series. Almqvist and Wiksells Boktryekeri A.B., Uppsala, Sweden. Xie, Y., 2014. knitr: a comprehensive tool for reproducible research in R. In: Stodden, V., Leisch, F., Peng, R.D. (Eds.), Implementing Reproducible Computational Research. Chapman and Hall/CRC.

86 Advances in Streamflow Forecasting Xie, Y., 2015. Dynamic Documents with R and knitr, second ed. Chapman and Hall/CRC. Xie, Y., 2019. knitr: A General-Purpose Package for Dynamic Report Generation in R. R Package version 1.25. https://CRAN.R-project.org/package¼knitr. Yevjevich, V., 1968. Misconceptions in hydrology and their consequences. Water Resour. Res. 4 (2), 225e232. https://doi.org/10.1029/WR004i002p00225. Yevjevich, V., 1987. Stochastic models in hydrology. Stoch. Hydrol. Hydraul. 1 (1), 17e36. https:// doi.org/10.1007/BF01543907. Yule, G.U., 1927. On a method of investigating periodicities disturbed series, with special reference to Wolfer’s sunspot numbers. Philos. Trans. R. Soc. A 226 (636e646), 267e298. https://doi.org/10.1098/rsta.1927.0007. Zambrano-Bigiarini, M., 2017. hydroGOF: Goodness-Of-Fit Functions for Comparison of Simulated and Observed Hydrological Time Series. R Package Version 0.3-10. https://CRAN. R-project.org/package¼hydroGOF. Zeileis, A., Grothendieck, G., 2005. zoo: S3 infrastructure for regular and irregular time series. J. Stat. Softw. 14 (6), 1e27. https://doi.org/10.18637/jss.v014.i06. Zeileis, A., Grothendieck, G., Ryan, J.A., 2019. Zoo: S3 Infrastructure for Regular and Irregular Time Series (Z’s Ordered Observations). R Package Version 1, 8-6. https://CRAN.R-project. org/package¼zoo.

Chapter 3

Introduction of multiple/multivariate linear and nonlinear time series models in forecasting streamflow process

Farshad Fathian
Department of Water Science and Engineering, Faculty of Agriculture, Vali-e-Asr University of Rafsanjan, Rafsanjan, Kerman Province, Iran

3.1 Introduction

Streamflow is one of the most important hydrological processes and provides basic information for many activities related to water resources planning and management all over the world (Fathian et al., 2018). Streamflow data are routinely used in drought and flood analyses, erosion and sediment control, operation of dam reservoirs for various uses, and so on (Fathian et al., 2019a). The water demand of the agricultural, drinking, and industrial sectors is increasing significantly as countries develop. Therefore, modeling and accurate estimation of streamflow, along with introducing new approaches and improving available ones, are of high importance and necessity (Mohammadi et al., 2006; Kişi and Shiri, 2011). Time series modeling has been developed theoretically and applied practically since the 1970s, and it is now treated as one of the most important tools for studying hydrological processes (Modarres and Ouarda, 2013a, 2013b). In general, time series data are modeled for generating, forecasting, and completing hydrological data series as well as for enabling managers and policymakers to make appropriate decisions (Salas et al., 1980). Various kinds of complex relationships have also been proposed for modeling hydrological variables, such as conceptual rainfall-runoff models, linear time series models of the streamflow process, and so forth. There are complex spatial and temporal dynamics in the structure of hydroclimatic variables whose features have not yet been fully explored. Hence, the linearity/nonlinearity of hydrological processes such as streamflow is one of the features that need to be captured through further investigation (Kumar et al., 2005; Wang et al., 2008; Machado et al., 2011; Fu et al., 2013). Thus, in streamflow modeling, one of the main issues, or the major concern, is whether to use a linear or a nonlinear approach (Sivapalan et al., 2002; Modarres and Ouarda, 2013a).

In general, time series models are well known for the ease of their implementation as a group of data-driven or mathematical models. They simulate hydrological processes based on previously recorded observations of the process itself (Modarres and Ouarda, 2013a; Karimi et al., 2015). Although univariate linear time series models have been widely applied to different hydrologic time series (Modarres, 2007; Machiwal and Jha, 2008; Sarhadi et al., 2014; Fathian et al., 2016; Mehdizadeh et al., 2020), more recently there has been growing interest in the application of "multiple/multivariate linear and nonlinear" (MLN) time series models in hydrology (Modarres and Ouarda, 2012). The MLN approaches have been further discussed and developed in fields such as statistics, econometrics, and mathematics; when this kind of time series approach is applied, series that are spatially and temporally dependent are considered jointly. The multiple/multivariate modeling procedure is more complex than the univariate one, especially when the number of intended time series is high. In this context, multivariate time series analysis provides useful techniques for processing information embedded in multiple measurements that have temporal and cross-sectional dependence (Tsay, 2013).

In this chapter, the use of "multiple/multivariate linear" time series approaches is demonstrated to model the conditional mean behavior (or the first-order moment) of hydrologic time series. Moreover, the vector autoregressive approach without/with exogenous variables (VAR/VARX) is used for forecasting both streamflow and rainfall-runoff processes. This approach is usually inadequate for removing nonlinear properties such as heteroscedasticity (time-varying variance) in hydrological data, which are related to the conditional variance-covariance structure (Liu et al., 2011; Fathian et al., 2018, 2019a; Fathian, 2019; Fathian and Vaheddoost, 2021). Hence, to capture the conditional time-varying variance (or the second-order moment) that may exist in such processes, the multiple/multivariate generalized autoregressive conditional heteroscedasticity (MGARCH) approach, known as the "nonlinear" time series model, is presented. These approaches, which are rarely applied for modeling hydrological processes, provide useful tools for processing information located in multiple/multivariate (multisite) measurements having temporal and cross-sectional dependence (Fathian et al., 2018, 2019a). In this context, this chapter intends to describe the methodology of the MLN time series approaches in detail as well as to present two case studies carried out on the upstream watershed of Zarrineh Rood Dam, one of the largest subbasins of the Urmia Lake Basin (ULB), northwestern Iran.

3.1.1 Review of MLN time series models

A review of research on data-driven models in hydrology shows that only a few studies have used MLN time series models; these are summarized here. As mentioned in the previous section, these models provide a better understanding of the dynamic links between multiple/multivariate hydrological data and capture more precisely their joint dynamics and nonlinearity in space and time (Fathian et al., 2018, 2019a). Among preliminary studies, Matalas (1967), Matalas and Wallis (1971), and others in the early 1970s suggested the application of some types of MLN models in hydrology. Although multiple/multivariate hydrologic data are widely modeled using mathematically based methods such as artificial intelligence (Dawson and Wilby, 2001; Shu and Ouarda, 2008; Machado et al., 2011; Wang et al., 2015) and multivariate regression techniques (Wang et al., 2008; Hundecha et al., 2008; McIntyre and Al-Qurashi, 2009), MLN models such as VAR/VARX, MGARCH, and their hybrids and derivatives are rarely applied to the modeling of hydrologic processes. Studies related to such models and their derivatives are mentioned in the following.

Niedzielski (2007) utilized a multivariate autoregressive (MAR) model for the regional-scale rainfall-runoff process of the Odra River, Poland. The results showed that forecasts based on multivariate discharge data using the MAR-based model are more accurate than the corresponding univariate predictions for a year with a flood. Solari and van Gelder (2011) used VAR and regime-switching VAR models for the simulation of wave height, period and direction, and wind speed and direction. Their findings showed that the VAR models are able to capture the main features of the original series, and regime-switching VAR models improve some aspects of the simulations. Fathian et al. (2018) applied the VARX and MGARCH models in modeling the mean and conditional heteroscedasticity of daily rainfall and runoff time series in the Zarrineh Rood Dam watershed, northwestern Iran. The evaluation criteria for the VARX-MGARCH model revealed an improvement over the VARX model performance. In another study, carried out by Fathian et al. (2019a), the VAR and MGARCH models were fitted to twofold daily streamflow time series to model the mean and conditional variance of the data in the Zarrineh Rood Dam watershed. The assessment criteria demonstrated that the use of the MGARCH model improves streamflow modeling efficiency by capturing the heteroscedasticity in the twofold residuals obtained from the VAR model for all experiments. Modarres and Ouarda (2013a) applied the MGARCH model to the rainfall-runoff relationship. They illustrated the novelty and usefulness of MGARCH models in hydrology by showing the variance-covariance structure and dynamic correlation in a rainfall-runoff process. In another study, Modarres and Ouarda (2014a) applied MGARCH models in modeling the relationship between climatic oscillations and drought characteristics. The results based on the MGARCH model output showed that there is a low level of covariance interaction between atmospheric circulations and the standardized precipitation index (SPI) time series for the two selected stations in Iran. Modarres and Ouarda (2014b) also used MGARCH models in modeling the relationship between local daily temperature and general circulation model (GCM) predictors. The results of the MGARCH model revealed a time-varying conditional correlation between GCM predictors and temperature time series.

3.2 Methodology

In this section, the methodology for building the VAR/VARX models and their hybrids with the MGARCH approach for streamflow/rainfall-runoff time series is explained. The flowchart (Fig. 3.1) provides more details concerning the different stages of the modeling procedure.

3.2.1 VAR/VARX model

The VAR model is the most commonly used form of the multiple linear time series approach. This model describes the dependency and interdependency of the normalized data in time. The VAR model with order p is represented as VAR(p) and is given according to the following equation (Fathian et al., 2019a):

$$Z_t = \phi_0 + \sum_{i=1}^{p} \phi_i Z_{t-i} + a_t \tag{3.1}$$

where $Z_t$ is the multiple time series at time $t$ (e.g., streamflow time series), $\phi_0$ is a $k$-dimensional constant vector, $\phi_i$ are $k \times k$ matrices for $i > 0$ with $\phi_p \neq 0$, and $a_t$ is a sequence of independent and identically distributed (i.i.d.) random vectors with mean zero and covariance matrix $\Sigma_a$. In hydrological applications, $k$ can represent the number of gauging stations (e.g., hydrometric sites). To better understand Eq. (3.1), the following equations show the bivariate VAR(1) model; notice that they can be expanded for the $k$-dimensional series:

$$\begin{aligned} z_{1,t} &= \phi_{10} + \phi_{1,11} z_{1,t-1} + \phi_{1,12} z_{2,t-1} + a_{1,t} \\ z_{2,t} &= \phi_{20} + \phi_{1,21} z_{1,t-1} + \phi_{1,22} z_{2,t-1} + a_{2,t} \end{aligned} \tag{3.2}$$

or it can be written equivalently as

$$\begin{bmatrix} z_{1,t} \\ z_{2,t} \end{bmatrix} = \begin{bmatrix} \phi_{10} \\ \phi_{20} \end{bmatrix} + \begin{bmatrix} \phi_{1,11} & \phi_{1,12} \\ \phi_{1,21} & \phi_{1,22} \end{bmatrix} \begin{bmatrix} z_{1,t-1} \\ z_{2,t-1} \end{bmatrix} + \begin{bmatrix} a_{1,t} \\ a_{2,t} \end{bmatrix} \tag{3.3}$$
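To make the notation of Eqs. (3.2)-(3.3) concrete, the short Python sketch below simulates a bivariate VAR(1) process; the coefficient values and series length are illustrative assumptions, not values taken from the chapter's case study.

```python
import numpy as np

# Illustrative bivariate VAR(1) simulation following Eq. (3.3):
# Z_t = phi0 + Phi1 @ Z_{t-1} + a_t, with i.i.d. Gaussian innovations.
rng = np.random.default_rng(42)

phi0 = np.array([0.1, 0.2])                  # constant vector (assumed values)
Phi1 = np.array([[0.6, 0.1],                 # phi_{1,11}, phi_{1,12}
                 [0.2, 0.5]])                # phi_{1,21}, phi_{1,22}
Sigma_a = np.array([[1.0, 0.3],
                    [0.3, 1.0]])             # innovation covariance matrix

T = 1000
a = rng.multivariate_normal(mean=np.zeros(2), cov=Sigma_a, size=T)
Z = np.zeros((T, 2))
for t in range(1, T):
    Z[t] = phi0 + Phi1 @ Z[t - 1] + a[t]     # the VAR(1) recursion of Eq. (3.2)
```

Each row of `Z` plays the role of the bivariate observation (e.g., standardized streamflow at two gauging stations) at one time step.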

FIGURE 3.1 Flowchart of the modeling of (A) streamflow and (B) rainfall-runoff processes.

From Eq. (3.3), one can see that the (1,2)th element of $\phi_1$, i.e., $\phi_{1,12}$, indicates the linear dependence of $z_{1,t}$ on $z_{2,t-1}$ in the presence of $z_{1,t-1}$. The other parameters can be interpreted in a similar way. Unlike the VAR model, which presents the time dependence and interdependence of the intended time series (Solari and van Gelder, 2011), the VARX model extends the VAR model with exogenous variables. The X in VARX represents the input covariate series, such as rainfall. To further expand the concept of the model, consider $Z_t$ to be a $k$-dimensional time series and $X_t$ an $m$-dimensional series of exogenous variables. Then, the VARX(p,s) model is given as (Tsay, 2013):

$$Z_t = \phi_0 + \sum_{i=1}^{p} \phi_i Z_{t-i} + \sum_{j=0}^{s} \beta_j X_{t-j} + a_t \tag{3.4}$$

where $\phi_i$ is a $k \times k$ matrix for $i > 0$, $\beta_j$ are $k \times m$ coefficient matrices, $a_t$ is a sequence of i.i.d. random vectors with mean zero and positive-definite covariance matrix $\Sigma_a$, and $p$ and $s$ are the orders of the model as nonnegative integers. As an example, the expansion of the bivariate VARX(1,0) model (i.e., $p = 1$ and $s = 0$) with one exogenous variable (i.e., the $X$ variable) and two gauging stations (i.e., $k = 2$) is as follows:

$$\begin{aligned} z_{1,t} &= \phi_{10} + \phi_{1,11} z_{1,t-1} + \phi_{1,12} z_{2,t-1} + \beta_{0,11} x_{1,t} + \beta_{0,12} x_{2,t} + a_{1,t} \\ z_{2,t} &= \phi_{20} + \phi_{1,21} z_{1,t-1} + \phi_{1,22} z_{2,t-1} + \beta_{0,21} x_{1,t} + \beta_{0,22} x_{2,t} + a_{2,t} \end{aligned} \tag{3.5}$$

or it can be written equivalently as

$$\begin{bmatrix} z_{1,t} \\ z_{2,t} \end{bmatrix} = \begin{bmatrix} \phi_{10} \\ \phi_{20} \end{bmatrix} + \begin{bmatrix} \phi_{1,11} & \phi_{1,12} \\ \phi_{1,21} & \phi_{1,22} \end{bmatrix} \begin{bmatrix} z_{1,t-1} \\ z_{2,t-1} \end{bmatrix} + \begin{bmatrix} \beta_{0,11} & \beta_{0,12} \\ \beta_{0,21} & \beta_{0,22} \end{bmatrix} \begin{bmatrix} x_{1,t} \\ x_{2,t} \end{bmatrix} + \begin{bmatrix} a_{1,t} \\ a_{2,t} \end{bmatrix} \tag{3.6}$$

It should be noted that before applying the univariate/multivariate time series approaches for modeling the conditional mean behavior of the time series data, the original data are transformed into a deseasonalized time series to obtain normal and nonseasonal series using a standardization technique (Fathian et al., 2019b). In this respect, two steps are considered. First, a new series is obtained by taking the logarithm of the daily dataset. Second, the new series is standardized by subtracting the seasonal mean values from the dataset and dividing the resulting values by their own seasonal standard deviation (Fathian et al., 2018). Herein, the whole original streamflow time series is standardized through the following equation (Salas et al., 1980):

$$Y_{t,s} = \log\left(X_{t,s}\right), \qquad Z_{t,s} = \frac{Y_{t,s} - \hat{\mu}_s}{\hat{\sigma}_s} \tag{3.7}$$

where $X_{t,s}$, $Y_{t,s}$, and $Z_{t,s}$ are the original, logarithmic, and standardized data at time $t$, respectively; $s = 1, \ldots, S$, where $S$ is the number of time intervals in the year; and $\hat{\mu}_s$ and $\hat{\sigma}_s$ are the seasonal mean and standard deviation, respectively (Fathian et al., 2019c).
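A minimal Python sketch of the two-step standardization in Eq. (3.7) is given below; it assumes the daily streamflow is held in a strictly positive pandas Series with a DatetimeIndex and takes the day of year as the seasonal interval (an assumption, since the chapter only states that seasonal means and standard deviations are used).

```python
import numpy as np
import pandas as pd

def deseasonalize(q: pd.Series) -> pd.Series:
    """Log-transform and deseasonalize a daily streamflow series (Eq. 3.7)."""
    y = np.log(q)                                  # Y_{t,s} = log(X_{t,s})
    season = y.index.dayofyear                     # seasonal index s (assumed daily intervals)
    mu_s = y.groupby(season).transform("mean")     # seasonal mean
    sigma_s = y.groupby(season).transform("std")   # seasonal standard deviation
    return (y - mu_s) / sigma_s                    # standardized series Z_{t,s}
```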

3.2.2 Model building procedure

In general, the model building procedure follows the iterated procedure of Box and Jenkins, including model specification, estimation, and diagnostic checking (Box et al., 2008). The orders of both models (VAR/VARX) are obtained by the Akaike Information Criterion (AIC) proposed by Akaike (1973) for model specification. Moreover, the coefficients of both models are estimated by the least squares method for model estimation, and those estimates are asymptotically normally distributed (Tsay, 2013). The AIC function that is commonly used to determine the model order p is given as (Akaike, 1973):

$$\mathrm{AIC}(p) = \ln\left|\tilde{\Sigma}_{a,p}\right| + \frac{2}{T}\, p k^2 \tag{3.8}$$

where $T$ is the sample size and $\tilde{\Sigma}_{a,p}$ is the covariance matrix of the sequence of i.i.d. random vectors.

To check the adequacy of a multiple/multivariate linear time series model, the structures of the autocorrelation functions (ACFs) and cross-correlation functions (CCFs) of the multiple residuals are first examined. It should be noted that for an i.i.d. series of length $n$, the lag-$k$ coefficients of the ACFs and CCFs are normally distributed with mean zero and variance $1/n$, and a 95% confidence limit is given by $\pm 1.96/\sqrt{n}$. The model fitted to a multiple time series is not accepted as adequate when the coefficients obtained from the ACFs and CCFs do not all fall within the confidence limits (Tsay, 2013). More formally, a multiple/multivariate Ljung-Box test (Li, 2004) is also applied to check the adequacy of the VAR/VARX model. This test computes a statistic $Q_k(m)$, which is $\chi^2$ distributed with $(m-p)k^2$ degrees of freedom and is given as follows (Fathian et al., 2018):

$$Q_k(m) = T^2 \sum_{k=1}^{m} \frac{1}{T-k}\, \mathrm{tr}\left( \hat{R}_k' \hat{R}_0^{-1} \hat{R}_k \hat{R}_0^{-1} \right) \tag{3.9}$$

where $T$ is the sample size, $m$ is the number of cross-correlation matrices of the multiple residuals, $\hat{R}_0$ and $\hat{R}_k$ are the lag-zero and lag-$k$ sample cross-correlation matrices of the multiple residuals, respectively, and the prime denotes the transpose of a matrix. The residuals are time-independent and the model is adequate if the statistic $Q_k(m)$ does not exceed the critical value (equivalently, if its p-value is higher than the significance level). For more details concerning the model building procedure, the reader may refer to Box et al. (2008) and Tsay (2013).
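As a hedged illustration of this model building procedure, the Python sketch below uses the statsmodels library (an assumed tool choice; the chapter does not name its software) to select the VAR order by AIC, run a multivariate portmanteau (whiteness) test on the residuals in the spirit of Eq. (3.9), and produce multi-step ahead forecasts. The input file name and the lag settings are placeholders.

```python
import pandas as pd
from statsmodels.tsa.api import VAR

# `z` is assumed to be a DataFrame of standardized daily streamflows,
# one column per gauging station (k columns); the file name is hypothetical.
z = pd.read_csv("standardized_streamflow.csv", index_col=0, parse_dates=True)

model = VAR(z)
results = model.fit(maxlags=15, ic="aic")       # order selection by AIC, cf. Eq. (3.8)
print("Selected order p =", results.k_ar)

# Portmanteau-type whiteness test on the multiple residuals, cf. Eq. (3.9);
# adequacy requires failing to reject (p-value above the significance level).
whiteness = results.test_whiteness(nlags=20, adjusted=True)
print(whiteness.summary())

# Multi-step ahead forecasts from the fitted VAR(p) model
forecast = results.forecast(z.values[-results.k_ar:], steps=7)
```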

3.2.3 MGARCH model

The MGARCH model is commonly applied to jointly model time series that have multiple measurements. This model identifies whether there is a relationship between the conditional variances of multiple time series through time (Francq and Zakoian, 2011). The general MGARCH model for a $k$-dimensional, zero-mean, serially uncorrelated time series $a_t = (a_{1t}, a_{2t}, \ldots, a_{kt})'$ can be presented as follows:

$$E(a_t \mid F_{t-1}) = 0, \qquad \Sigma_t = H_t = \mathrm{Cov}(a_t \mid F_{t-1}) = E\left(a_t a_t' \mid F_{t-1}\right) \tag{3.10}$$

$$F_{t-1} = \{a_{t-1}, a_{t-2}, \ldots\}, \qquad a_t \mid F_{t-1} \sim (0, \Sigma_t) \tag{3.11}$$

$$a_t = H_t^{1/2} \varepsilon_t \tag{3.12}$$

where $H_t$ is the conditional covariance matrix of the multiple time series $a_t$, $\varepsilon_t$ is a $k$-dimensional i.i.d. white noise, $\varepsilon_t \sim \mathrm{i.i.d.}(0, I_k)$, and $F_{t-1}$ denotes the $\sigma$-field generated by the past data $\{z_{t-i} \mid i = 1, 2, \ldots\}$. Further details about the MGARCH models are already well established in the literature (Tsay, 2010; Francq and Zakoian, 2011).

3.2.3.1 Diagonal VECH model

The diagonal VECH model (DVECH hereafter), one of the main kinds of MGARCH model, was developed by Bollerslev et al. (1988). The VECH term represents the half-vectorization operator, which stacks the columns of a square matrix from the diagonal downwards into a vector. The DVECH model is given as (Tsay, 2013)

$$H_t = \Omega + \sum_{i=1}^{m} G_i \odot \left(a_{t-i} a_{t-i}'\right) + \sum_{j=1}^{s} \tilde{G}_j \odot H_{t-j} \tag{3.13}$$

where $\Omega$ is a constant matrix, $G_i$ and $\tilde{G}_j$ are symmetric coefficient matrices, $m$ and $s$ are nonnegative integers, and $\odot$ denotes the Hadamard product, i.e., element-by-element multiplication. To better understand Eq. (3.13), a bivariate DVECH(1,1) is presented, as an example, in terms of its lower triangular elements:

$$\begin{bmatrix} h_{11,t} \\ h_{21,t} \\ h_{22,t} \end{bmatrix} = \begin{bmatrix} \omega_{11} \\ \omega_{21} \\ \omega_{22} \end{bmatrix} + \begin{bmatrix} g_{11} \\ g_{21} \\ g_{22} \end{bmatrix} \odot \begin{bmatrix} a_{1,t-1}^2 \\ a_{1,t-1} a_{2,t-1} \\ a_{2,t-1}^2 \end{bmatrix} + \begin{bmatrix} \tilde{g}_{11} \\ \tilde{g}_{21} \\ \tilde{g}_{22} \end{bmatrix} \odot \begin{bmatrix} h_{11,t-1} \\ h_{21,t-1} \\ h_{22,t-1} \end{bmatrix} \tag{3.14}$$

where only the lower triangular part of the model is given and the number of parameters in the diagonal VECH(1,1) equals $3\left(k(k+1)/2\right)$.

According to Eq. (3.14), there are nine parameters to estimate for the bivariate model. Here, $h_{11,t}$ and $h_{22,t}$ are the conditional variances of the streamflow time series at the first and second gauging stations, respectively. Moreover, $h_{21,t}$ indicates the conditional covariance, or co-volatility, between the two time series (Tsay, 2010; Modarres and Ouarda, 2013a). For the bivariate case, Eq. (3.14) can also be presented as

$$\begin{aligned} h_{11,t} &= \omega_{11} + g_{11} a_{1,t-1}^2 + \tilde{g}_{11} h_{11,t-1} \\ h_{21,t} &= \omega_{21} + g_{21} a_{1,t-1} a_{2,t-1} + \tilde{g}_{21} h_{21,t-1} \\ h_{22,t} &= \omega_{22} + g_{22} a_{2,t-1}^2 + \tilde{g}_{22} h_{22,t-1} \end{aligned} \tag{3.15}$$
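The recursion in Eq. (3.15) is straightforward to evaluate once the parameters are known. The Python sketch below computes the conditional variance-covariance paths of a bivariate DVECH(1,1) for given (assumed) parameter values; it is only an illustration of the recursion and not the maximum-likelihood estimation procedure used to fit the model.

```python
import numpy as np

def dvech11_covariances(a, omega, g, g_tilde):
    """Bivariate DVECH(1,1) recursion of Eq. (3.15).

    a        : (T, 2) array of mean-equation residuals (e.g., from a VAR/VARX model)
    omega    : (3,) constants   [w11, w21, w22]
    g        : (3,) ARCH terms  [g11, g21, g22]
    g_tilde  : (3,) GARCH terms [g~11, g~21, g~22]
    Returns a (T, 3) array of [h11_t, h21_t, h22_t].
    """
    T = a.shape[0]
    h = np.zeros((T, 3))
    # start the recursion from the unconditional sample (co)variances
    h[0] = [np.var(a[:, 0]), np.cov(a[:, 0], a[:, 1])[0, 1], np.var(a[:, 1])]
    for t in range(1, T):
        cross = np.array([a[t - 1, 0] ** 2,
                          a[t - 1, 0] * a[t - 1, 1],
                          a[t - 1, 1] ** 2])
        h[t] = omega + g * cross + g_tilde * h[t - 1]   # element-wise, Eq. (3.15)
    return h
```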

3.2.3.2 Testing conditional heteroscedasticity

The residual time series obtained from the VAR/VARX model, which models the first-order moment of the multiple hydrological data, is assumed to have a time-invariant variance. In other words, if there is no conditional heteroscedasticity in the multiple residual time series $a_t$, its conditional covariance matrix $\Sigma_t$ (the second-order moment, examined through the squared residuals) is time-independent; a time-varying conditional covariance is called an ARCH effect. Before the modeling procedure, the presence of conditional heteroscedasticity therefore needs to be checked with a Portmanteau test (Tsay, 2013). The Ljung-Box statistic of this test, $Q_k^{*}(m)$, for an adequately fixed $m$ can be written as follows (Li, 2004):

$$Q_k^{*}(m) = T^2 \sum_{i=1}^{m} \frac{1}{T-i}\, \hat{b}_i' \left( \hat{\rho}_0^{-1} \otimes \hat{\rho}_0^{-1} \right) \hat{b}_i \tag{3.16}$$

where $T$ is the sample size, $k$ is the dimension of $a_t$, and $\hat{b}_i = \mathrm{vec}(\hat{\rho}_i')$ with $\hat{\rho}_i$ being the lag-$i$ sample cross-correlation matrix of the squared residuals $a_t^2$. Under the null hypothesis that $a_t$ has no conditional heteroscedasticity, $Q_k^{*}(m)$ is asymptotically distributed as $\chi^2_{mk^2}$, that is, a chi-square distribution with $mk^2$ degrees of freedom (Li, 2004). The $Q_k^{*}(m)$ statistic is asymptotically equivalent to the multivariate generalization of the Lagrange multiplier test proposed by Engle (1982).
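In practice, a quick screening for the ARCH effect can be done by applying a Ljung-Box test to the squared residuals of each series separately; the sketch below is such a univariate, per-series approximation of the multivariate statistic in Eq. (3.16), using statsmodels (an assumed tool choice; a recent version returning a DataFrame is assumed).

```python
import pandas as pd
from statsmodels.stats.diagnostic import acorr_ljungbox

def arch_effect_check(resid: pd.DataFrame, lags: int = 20) -> pd.Series:
    """Ljung-Box test on the squared residuals of each column of `resid`.

    `resid` is assumed to hold the VAR/VARX residuals, one column per station.
    Small p-values point to conditional heteroscedasticity (an ARCH effect),
    in which case an MGARCH-type model such as DVECH is warranted.
    """
    pvalues = {}
    for col in resid.columns:
        lb = acorr_ljungbox(resid[col] ** 2, lags=[lags])  # DataFrame with lb_stat, lb_pvalue
        pvalues[col] = lb["lb_pvalue"].iloc[0]
    return pd.Series(pvalues, name="ljung_box_pvalue_on_squared_residuals")
```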

3.2.4 Case study

In this chapter, the application of the VAR/VARX and MGARCH methods, which were described in the previous section, for modeling multiple streamflow time series as well as the multivariate rainfall-runoff process is demonstrated through a case study. For this purpose, the upstream subbasin of the Zarrineh Rood Dam, located in the southern part of the ULB, northwestern Iran, was selected (Fig. 3.2). Besides, the daily average streamflow data of six hydrometric stations and the daily rainfall data of 20 gauge stations were used for the 1997-2011 period (Fathian et al., 2018, 2019a). The study area is composed of four river subbasins, namely, Saghez Chai (including the "GH" and "D" stations), Jighato Chai (including the "PG" and "PA" stations), Khorkhoreh Chai (including the "S" station), and Sarogh Chai (including the "SK" station) from west to east, respectively, discharging water into the Zarrineh Rood Dam's reservoir. In addition, using the Thiessen polygon method, weighted daily average rainfall data were computed for each corresponding streamflow gauging station (Fathian et al., 2019a, 2019b).

FIGURE 3.2 Geographical positions of the rivers and the concerned streamflow and rainfall gauging stations in the study area (Fathian et al., 2018).

In order to apply the models mentioned above, the number of stations (the k-dimension) should be chosen within the study area. Different experiments were selected to assess the performances of the mentioned approaches, including three distinct bivariate experiments that are combinations of upstream and downstream stations (Table 3.1). In other words, three bivariate VAR(p)/VARX(p) and DVECH(m,s) models were used so that each experiment consists of two stations (k = 2).

3.3 Application of VAR/VARX approach

3.3.1 The VAR model

For fitting a VAR model to the defined experiments, in the first step, the order of the fitted model was determined. Hence, the ACF of the streamflows, the CCF of the two selected stations for each experiment, and the calculated AIC value were checked. Experiment 1, as an example, is used to illustrate the properties of the two streamflow series at the upstream ("GH") and downstream ("D") stations. Fig. 3.3 shows the time series plots of the two original streamflows and their corresponding standardized time series. Fig. 3.3A and B demonstrates the annual seasonality; however, no trend or seasonality was observed in the deseasonalized time series at either station (Fig. 3.3C and D). The ACF and CCF graphs of the two stations' data for experiment 1 showed a high persistence and nonstationary behavior in both time series (Fig. 3.4A-C). The persistence in both time series was reduced using the standardization technique (Fig. 3.4D-F). Furthermore, the standardized series presented fast-decaying ACFs and CCF. The best-fitted model was selected by a trial and error procedure for each experiment based on both the smallest value of AIC and testing the multiple residuals of the fitted model for adequacy. Thus, VAR models with orders 5, 9, and 12 were fitted to the daily streamflows for experiments 1-3, respectively. For experiment 1, as an example, the properties and the adequacy of the VAR(5) model fitted to the bivariate time series are shown in Table 3.2 and Fig. 3.5A, respectively.

TABLE 3.1 List of experiments defined in the present study (Fathian et al., 2018, 2019a).

Experiments     Selected stations   Location
Experiment 1    "GH"/"D"            Upstream/downstream
Experiment 2    "PG"/"PA"           Upstream/downstream
Experiment 3    "S"/"SK"            Downstream/downstream

FIGURE 3.3 Daily streamflow (m3/s) time series plots relating to (A) "GH," (B) "D," and (C, D) their corresponding standardized time series (Fathian et al., 2019a).

FIGURE 3.4 Autocorrelation function and cross-correlation function graphs for the daily streamflow time series at "GH" and "D" stations; (A-C) for the original data and (D-F) for the standardized data (Fathian et al., 2019a).

TABLE 3.2 Properties of the VAR(5) model fitted to the twofold streamflow time series for experiment 1 (estimated parameter matrices and standard error matrices for the coefficients φ1-φ5).

FIGURE 3.5 (A) P-values of the multiple Ljung-Box test, (B-C) autocorrelation functions, and (D) cross-correlation function of the bivariate residuals for the VAR(5) model at "GH" and "D" stations (Fathian et al., 2019a).

According to the multiple Ljung-Box test, all p-values are above the critical level, α = 0.05. Fig. 3.5B-D also shows that there are no auto- and cross-correlation structures in the multiple residuals obtained from the VAR(5) model, as seen in the ACF and CCF graphs (i.e., all coefficients are within the confidence limits).

3.3.2 The VARX model

The first step before building a VARX model is to identify whether a causal relationship exists between the rainfall and runoff time series. Accordingly, the CCF was plotted to explore the significant correlation between the residuals of the rainfall time series (as the exogenous variable) and the streamflow time series (as the endogenous variable) obtained from autoregressive moving average models fitted to each variable separately. The results of the residual CCF graphs for all stations at different lag times are shown in Fig. 3.6. It is seen that a significant, though not very strong, correlation exists between rainfall and streamflow for each station at lag times k = 1-3 (i.e., the fitted model may involve daily rainfall from today to 2 days ago, i.e., times t, t-1, and t-2). To build the VARX model, the order of the VAR part of the model was determined by examining the ACF and CCF between rainfall and runoff as well as the AIC value of the streamflow time series. Therefore, the VAR model was fitted to the twofold standardized time series for each experiment using a trial and error procedure. In the end, the best-fitted model among the different candidate models was chosen based on both the minimum AIC and testing the multiple residuals of the model for adequacy. Regarding the best-fitted models, VARX(5,1), VARX(10,2), and VARX(13,2) models were fitted to the daily rainfall-runoff process for experiments 1-3, respectively. Table 3.3 presents the properties of the VARX(5,1) model fitted to the rainfall-runoff process for experiment 1 as an example. More formally, for checking the adequacy of the best-fitted models to the bivariate time series, Fig. 3.7 demonstrates that the p-values of the multivariate Ljung-Box test are above the critical level (α = 0.05) for experiment 1. Moreover, the ACFs and CCF of the residuals are not found to be significant, and hence, all coefficients are within the confidence limits.
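The lag-identification step described above can be sketched as follows in Python; the ARMA filtering order, the maximum lag, and the use of statsmodels are assumptions made for illustration, and the residual cross-correlations are compared against the usual ±1.96/√n limits.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.stattools import ccf

def residual_ccf(rain, flow, order=(1, 0, 0), max_lag=10):
    """Cross-correlations between ARMA-filtered rainfall and streamflow residuals.

    `rain` and `flow` are assumed to be aligned 1-D arrays of standardized daily
    rainfall and streamflow for one station; `order` is an assumed filtering order.
    Lags whose coefficients exceed the confidence limit suggest which rainfall
    lags (the s in VARX(p, s)) to include as exogenous inputs.
    """
    r_res = ARIMA(rain, order=order).fit().resid   # filtered rainfall residuals
    q_res = ARIMA(flow, order=order).fit().resid   # filtered streamflow residuals
    coefficients = ccf(r_res, q_res)[: max_lag + 1]  # check the lag convention of ccf
    limit = 1.96 / np.sqrt(len(r_res))             # 95% confidence limit
    return coefficients, limit
```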

3.4 Application of MGARCH approach

According to the ACFs and CCF of the multiple residuals obtained from the VAR/VARX models, the correlation coefficients are statistically not significant. The ACFs of the squared multiple residuals, however, may reveal the presence or absence of the ARCH effect (conditional heteroscedasticity). The ACF and CCF graphs of the squared residual time series obtained from the multiple/multivariate linear models for experiment 1 are shown in Fig. 3.8, for instance. Time-varying variance (conditional heteroscedasticity) is observed in the squared residual series, as the ACFs and CCF exceed the confidence limits at different lag times.

FIGURE 3.6 Cross-correlation functions between the residuals of rainfall and streamflow time series at all hydrometric stations (Fathian et al., 2018).

The graphs demonstrated the presence of conditional heteroscedasticity (the ARCH effect) in the squared residuals of the multiple/multivariate linear time series models. The results of the Portmanteau test also showed that the p-values are less than the critical value, α = 0.05, for all lags. Therefore, the null hypothesis of no heteroscedasticity effect is rejected, and the heteroscedasticity effect needs to be accounted for, which is done here using the MGARCH approach for each experiment.

TABLE 3.3 Properties of the VARX(5,1) model fitted to the bivariate rainfall-runoff process for experiment 1 (estimated parameter matrices and standard error matrices for the coefficients φ1-φ5, β0, and β1).

FIGURE 3.7 (A) P-values of the multivariate Ljung-Box test, (B-C) autocorrelation functions, and (D) cross-correlation function of the twofold residuals for the VARX(5,1) model for experiment 1 (Fathian et al., 2018).

As mentioned in the methodology, the DVECH model, a type of MGARCH approach, was used to capture the conditional variance-covariance of the multiple residuals obtained from the VAR/VARX model. Hence, the bivariate DVECH model was developed for modeling the conditional variance-covariance structure in the multiple residuals of the streamflow time series for each experiment. The best-fitted models were selected based on the minimum AIC by trying different orders of the bivariate DVECH model fitted to the daily twofold residual time series for each experiment. Finally, the bivariate DVECH(1,1), DVECH(3,1), and DVECH(2,1) models were fitted to the multiple residual time series obtained from the VAR model in the modeling procedure of the streamflow process for experiments 1-3, respectively (Table 3.4). Moreover, the bivariate DVECH(1,1) model was fitted to the multiple residual time series obtained from the VARX model in the modeling procedure of the rainfall-runoff process for all three experiments (Table 3.5). To check the adequacy of the DVECH model fitted to each experiment, the ACF of the squared standardized residual time series and of its cross product (the εt matrix series obtained from Eq. 3.12) are used, according to the suggestion provided by Tse (2002).

(a)"GH"

Auto-correlation function (ACF)

Auto-correlation function (ACF)

104 Advances in Streamflow Forecasting

(b)"D"

Lag (day)

Lag (day)

Cross-correlation function (CCF)

0.2

(d)"GH"&"D"

(c)"GH"&"D"

P-value

.15

0.1

.05

0 0

Lag (day)

5

10

15

20

25

Lag (day)

FIGURE 3.8 (AeB) Autocorrelation functions, (C) Cross-correlation function of the squared twofold residuals obtained from the VAR(5)/VARX(5,1) model, and (D) P-values of the Portmanteau test for the squared twofold residuals at “GH” and “D” stations (Fathian et al., 2018, 2019a).

The ACFs of the squared standardized residual time series for the "GH" and "D" stations in the modeling procedure of both the streamflow and rainfall-runoff processes are shown as examples in Figs. 3.9 and 3.10, respectively. It is revealed that the ACFs are not significant, and hence, all coefficients at different lag times are within the confidence limits. Therefore, the null hypothesis of no ARCH effect could not be rejected.

3.5 Comparative evaluation of models' performances

In this section, the comparative performance of the VAR/VARX model and of its combination with the DVECH approach, evaluated for modeling both the twofold streamflow and rainfall-runoff processes, is discussed for each experiment. Before discussing the evaluation criteria, time series plots of the observed and estimated values of both the twofold streamflow and rainfall-runoff processes were drawn for experiment 1, as an example, in Figs. 3.11 and 3.12, respectively.

TABLE 3.4 Parameters of the bivariate DVECH models fitted to the multiple residuals obtained from the VAR model in modeling the streamflow process for each experiment (Fathian et al., 2019a). Note: Ω is the matrix of constants, and the G and G̃ terms are the coefficient matrices.

TABLE 3.5 Parameters of the bivariate DVECH models fitted to the multiple residuals obtained from the VARX model in modeling the rainfall-runoff process for each experiment (Fathian et al., 2018). Note: Ω is the matrix of constants, and the G and G̃ terms are the coefficient matrices.

FIGURE 3.9 (A-B) Autocorrelation functions of the squared standardized twofold residuals and (C) of their cross product for the VAR(5)-DVECH(1,1) model at "GH" and "D" stations of experiment 1 (Fathian et al., 2019a).

FIGURE 3.10 (A-B) Autocorrelation functions of the squared standardized twofold residuals and (C) of their cross product for the VARX(5,1)-DVECH(1,1) model at "GH" and "D" stations of experiment 1 (Fathian et al., 2018).

FIGURE 3.11 Time series plots of the observed and estimated streamflows for (A, C) the VAR and (B, D) the VAR-DVECH models for the twofold streamflow process in experiment 1 (Fathian et al., 2019a).

FIGURE 3.12 Time series plots of the observed and estimated streamflows for (A, C) the VARX and (B, D) the VARX-DVECH models for the twofold rainfall-runoff process in experiment 1 (Fathian et al., 2018).

The results showed that the performance of the VAR/VARX-DVECH model is better than that of the VAR/VARX model in estimating high magnitudes of the streamflow (Fig. 3.11) and rainfall-runoff (Fig. 3.12) processes. However, both models performed well in estimating low streamflow magnitudes. It is further observed that the performance of the VAR/VARX and VAR/VARX-DVECH models is almost the same in the case of the high streamflows for the "GH" station located in the upstream part of the subbasin. On the other hand, the fluctuations of the high streamflow observations revealed the better performance of the VAR-DVECH approach compared with the VAR one for the "D" station located in the downstream part of the subbasin. Values of the performance evaluation criteria for all three experiments of the VAR/VARX and VAR/VARX-DVECH models are presented in Table 3.6. The numerical evaluation criteria broadly used in the literature are classified into three groups: absolute error, relative error, and dimensionless metrics (Dawson et al., 2007; Modarres and Ouarda, 2012, 2013a, 2013b). It is seen that the VAR/VARX-DVECH model outperforms the VAR/VARX model based on the statistical evaluation criteria adopted in this study. As a result, the VAR/VARX-DVECH model is adjudged a "very good" model according to the model efficiency classification relating to R², as suggested by Dawson et al. (2007).

TABLE 3.6 Evaluation criteria of the VAR/VARX and VAR/VARX-DVECH models for all three defined experiments (Fathian et al., 2018, 2019a). AME, absolute maximum error; MAE, mean absolute error; PDIFF, peak difference; R², coefficient of determination; RAE, relative absolute error; RMSE, root mean squared error. Bold values indicate that the model performs better than the others on that criterion.

Similarly, the VAR/VARX model is considered "satisfactory"/"very satisfactory" based on its R² values. The absolute maximum error criterion revealed a smaller error for the VAR/VARX model than for the VAR/VARX-DVECH model for almost all experiments/stations. The error in the peak streamflow estimation (PDIFF) for the VAR/VARX-DVECH model is, however, less than that of the VAR/VARX model for most experiments/stations. As a result, the VAR/VARX-DVECH model, showing less error than the VAR/VARX model, is more adequate for estimating maximum streamflows when one upstream and one downstream station are jointly considered. In general, the VAR/VARX-DVECH model is identified as the better model compared with the VAR/VARX model, and hence, one can conclude that the main advantage of the DVECH model is its ability to capture the conditional heteroscedasticity of the twofold residuals of the VAR/VARX model for both the twofold streamflow and rainfall-runoff processes.
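For readers who wish to reproduce a comparison of this kind, the Python sketch below computes the criteria listed in Table 3.6; the formulas follow common conventions (e.g., the hydrotest toolbox of Dawson et al., 2007) and are stated here as assumptions, since the chapter does not spell them out.

```python
import numpy as np

def evaluation_criteria(obs, est):
    """Evaluation criteria of Table 3.6 for observed and estimated streamflows."""
    obs = np.asarray(obs, dtype=float)
    est = np.asarray(est, dtype=float)
    err = est - obs
    return {
        "AME": np.max(np.abs(err)),                       # absolute maximum error
        "PDIFF": np.max(obs) - np.max(est),               # peak difference (assumed definition)
        "MAE": np.mean(np.abs(err)),                      # mean absolute error
        "RMSE": np.sqrt(np.mean(err ** 2)),               # root mean squared error
        "RAE": np.sum(np.abs(err)) / np.sum(np.abs(obs - obs.mean())),  # relative absolute error
        "R2": np.corrcoef(obs, est)[0, 1] ** 2,           # coefficient of determination
    }
```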

3.6 Conclusions

In this chapter, the application of MLN time series models, which had not previously been used to model both the conditional mean and variance behaviors of multiple/multivariate hydrologic processes, is introduced. In this regard, the VAR/VARX and DVECH models were used to model the first- and second-order moments of the multiple daily streamflow and multivariate rainfall-runoff processes, respectively. For this purpose, the upstream watershed of Zarrineh Rood Dam, located in northwestern Iran, was selected as the study area for introducing and applying the aforementioned models. The results showed that the twofold residuals obtained from the linear VAR/VARX model indicated the adequacy of the fitted models for both the multiple streamflow and multivariate rainfall-runoff processes in each experiment. The best-fitted models were verified with the multiple Ljung-Box test. The Portmanteau test confirmed that the variance-covariance of the twofold residual time series obtained from the fitted VAR/VARX model is not homoscedastic. Hence, the multiple/multivariate linear models are not able to capture the time-varying variance-covariance (the ARCH effect) existing in the multiple/multivariate processes. The ARCH effect (conditional heteroscedasticity) existing in the twofold residuals from the multiple/multivariate linear VAR/VARX model was fully captured using the DVECH model. The combination of a VAR/VARX model, used to model the mean behavior of the hydrologic processes, and a DVECH model, used to model the time-varying variance of the twofold residuals obtained from the VAR/VARX model, is called the VAR/VARX-DVECH error model. Using such a VAR/VARX-DVECH error model, the multiple/multivariate daily hydrologic processes were fitted well. Capturing the conditional heteroscedasticity is the advantage of the DVECH model, and combining it with the linear models improved their performance according to the performance evaluation criteria. By considering the cross product of the twofold residuals, which may help stabilize the variance and reduce the model error, the combined error model outperformed the VAR/VARX model.


Chapter 4

Concepts, procedures, and applications of artificial neural network models in streamflow forecasting

Arash Malekian(1), Nastaran Chitsaz(2)

(1) Department of Reclamation of Arid and Mountainous Regions, Faculty of Natural Resources, University of Tehran, Tehran, Iran; (2) National Centre for Groundwater Research and Training, College of Science and Engineering, Flinders University, Bedford Park, South Australia, Australia

4.1 Introduction

Artificial intelligence (AI) techniques include knowledge-driven and data-driven computational methods and tools that provide solutions to complex problems in different research areas (Agatonovic-Kustrin and Beresford, 2000). The AI techniques are divided into two categories: (i) knowledge-driven and (ii) data-driven. The first category comprises the knowledge-based expert systems, which simulate human experience and allow an expert to solve a problem and draw conclusions from a set of rules through thinking processes (Sukor et al., 2019). This approach is useful for decision-making in environments that involve vagueness and uncertainty. The second category involves data-driven mathematical tools such as artificial neural networks (ANNs), support vector machines, etc. ANN modeling is a form of AI technique that has been widely used in the literature for forecasting a system variable, especially in computer science research, since the 1980s (Yang and Yang, 2014). The goal of ANN modeling is to develop a simplified model of the human brain and mimic the brain's behavior in learning tasks or processes to solve problems in fields such as electrical engineering, agricultural engineering, and process and food engineering, among others (Xia and Fan, 2012; Li and Chao, 2020; Agatonovic-Kustrin and Beresford, 2000). The sets of cells in the brain that transfer information are called neurons, and each neuron sends signals containing data and information through its interconnections with other neurons (Agatonovic-Kustrin and Beresford, 2000). ANN modeling essentially gathers knowledge by defining the pattern and


relationships in the information to train and "learn by examples" through experience instead of through "programming processes" (Araghinejad, 2014). All elements and aspects of learning, including how to learn and how to induce or deduce, are involved in ANN models, which is helpful for prediction and interpolation (Yang and Yang, 2014). ANN models are able to recognize the fundamental behavior between variables even if the original information is noisy and contains some errors. This feature has facilitated the application of ANN modeling to complicated problems that are difficult to understand but for which data for training of the ANN model are readily available. Thus, ANN models generate an empirical relationship between input and output data through a training process and can approximate functions continuously and to a desired accuracy, especially in data-dependent problems where the internal physical or process-based relationships between data are complicated or have not been discovered (Nguyen et al., 2020). Their self-adaptive and nonlinear nature makes ANNs a powerful tool for modeling, classification, noise reduction, pattern recognition, approximation, and prediction of complex relationships in data (Govindaraju and Rao, 2000). ANN modeling comprises systems that model the way the human brain processes information. Data gathering and knowledge processing in neural networks are performed by defining the patterns and relationships in the dataset for clustering, classification, and simulation (Araghinejad, 2014). After recognizing a pattern in the data, ANN models are able to adapt solutions over time and to process information rapidly. ANN models learn recursively from the input data and have a very flexible mathematical structure; they are therefore time-saving models, especially for complex, nonlinear processes that may not be easily modeled by traditional methods such as physical process-based models and stochastic time series models. For these reasons, ANN models are increasingly used in different research areas, being efficient and attractive, with remarkable success in pattern recognition even in complex tasks (Yang and Yang, 2014). One of the most important applications of ANNs is in hydrology, climatology, and water resources management research. Water resources systems need streamflow forecasting to optimize their operation, while the spatiotemporal variability of streamflow makes forecasting a critical problem in this regard (Dariane and Azimi, 2018). The reliability of ANN models in streamflow forecasting has been demonstrated by many hydrologists owing to their superiority over conceptual and physically based models (Kişi, 2010; Firat et al., 2009; Zhu et al., 2020; Toth and Brath, 2007; Javan et al., 2015). The objective of this chapter is to provide an overview of ANN models in streamflow forecasting along with a description of the concept and structure of the salient ANN models.


4.2 Procedure for development of artificial neural network models

4.2.1 Structure of artificial neural network models

The structure of an ANN contains neurons, an input layer, an output layer, hidden layer(s), weights, activation functions, and biases (Fig. 4.1). Each component of the ANN model is briefly explained below along with its functioning.

4.2.1.1 Neurons and connection formula

An artificial neuron is the main building block of the ANN model and simulates the behavior of biological neurons. Each neuron is a unit that combines its inputs (X) and compares them with a threshold (θ) to estimate the appropriate output (Y) (Araghinejad, 2014). A weight (W) is used to control the significance of each input through a weight matrix; the weights express the effectiveness of each input, and the input with the highest weight has the highest impact on the neural network. The intercept of the linear function, the bias (b), is added to the sum of the weighted inputs to adjust the output, so that the ANN model can be fitted optimally to the given input data. The mathematical relationship between the different components, including the input and output data, is expressed as Eq. (4.1) (Govindaraju and Rao, 2000):

Y = b + \sum_{n=1}^{m} W_n X_n   (4.1)

FIGURE 4.1 Structure of the artificial neural network.

118 Advances in Streamflow Forecasting

where X, input; W, weight; b, bias; m, number of inputs for each neuron, and Y, output. The transfer functions may be represented by different mathematical expressions as classified in the next section.
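As an informal illustration (not code from the chapter), the following sketch evaluates Eq. (4.1) for a single neuron; the input values, weights, and bias are hypothetical.

```python
import numpy as np

# Hypothetical inputs, weights, and bias for one artificial neuron (Eq. 4.1)
x = np.array([0.2, 0.5, 0.9])      # m = 3 inputs, X_n
w = np.array([0.4, -0.1, 0.7])     # weights W_n, one per input
b = 0.05                           # bias b

# Weighted sum of inputs plus bias: Y = b + sum_n W_n * X_n
y = b + np.dot(w, x)
print(y)   # raw neuron output before any transfer (activation) function is applied
```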

4.2.1.2 Transfer function

The operation of the ANN is affected by the manner in which neurons are connected to each other. The inputs of artificial neurons are divided into excitatory and inhibitory inputs, which indicate how strongly one neuron affects the neurons connected to it (Agatonovic-Kustrin and Beresford, 2000). An excitatory/inhibitory neuron increases/decreases the firing rates of the connected neurons, respectively; the stronger the connection, the stronger the excitation or inhibition, and vice versa for a weak connection. The transfer function describes the variation in the neuron's firing rate with changes in the received input. It represents a mathematical relation among the inputs of a neuron, such as the summation or the average of the inputs, among other more complicated operations. The threshold applied to the input data in Fig. 4.1 represents a transfer function, which can be of the different types shown in Table 4.1.

4.2.1.3 Architecture of neurons

The simplest architecture of an ANN consists of the combination of an input layer, a transfer function, and an output layer. A single neuron is not sufficient for solving complex practical problems; therefore, neural networks combine many perceptrons in parallel as well as in series (Alotaibi et al., 2018). The number of neurons has a significant effect on the efficiency and capability of an ANN model in providing solutions to a problem. Many types of neural networks exist in the literature, though most share a similar basic structure with three main parts: (i) an input layer with a set of input nodes (neurons), (ii) one or more hidden layers with hidden nodes (neurons), and (iii) an output layer with a set of output nodes (neurons) (McNelis, 2005). The input nodes contain all the information of the input variables, in the form of digitized pictures or numerical values, needed by the developed neural network. The activation values of each node and its preceding nodes are summed and passed to the next node in the network. This passing of activation is governed by the connection strengths, the inhibition or excitation conditions, and the transfer functions (Agatonovic-Kustrin and Beresford, 2000). The activation information therefore passes through the neural network from the input layer to the hidden layers, one by one, and finally to the output layer. One of the structures of the neurons is the feedforward network shown in Fig. 4.2. In a feedforward neural


TABLE 4.1 Mathematical forms of the transfer functions (Araghinejad, 2014).

Transfer function f(x) | Mathematical expression
Linear | f(x) = x
Unit step | f(x) = 0 if x < 0; f(x) = 1 if x \geq 0
Piecewise linear (\theta: threshold) | f(x) = -d if x \leq -\theta; f(x) = x if -\theta \leq x \leq \theta; f(x) = d if x \geq \theta
Log sigmoid | f(x) = \dfrac{1}{1 + e^{-ax}}, \; a > 0
Tangent sigmoid | f(x) = \dfrac{2}{1 + e^{-ax}} - 1, \; a > 0
Radial basis (Gaussian) | f(x) = e^{-(x/\sigma)^2}
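As a small, hedged sketch (not from the chapter), the transfer functions of Table 4.1 can be written directly in code; the parameter values a, d, θ, and σ below are illustrative defaults.

```python
import numpy as np

# Sketch of the transfer functions listed in Table 4.1
def linear(x):
    return x

def unit_step(x):
    return np.where(x < 0.0, 0.0, 1.0)

def piecewise_linear(x, d=1.0, theta=1.0):
    # -d below -theta, identity inside the band, +d above +theta
    return np.where(x <= -theta, -d, np.where(x >= theta, d, x))

def log_sigmoid(x, a=1.0):
    return 1.0 / (1.0 + np.exp(-a * x))

def tangent_sigmoid(x, a=1.0):
    return 2.0 / (1.0 + np.exp(-a * x)) - 1.0

def radial_basis(x, sigma=1.0):
    return np.exp(-(x / sigma) ** 2)

x = np.linspace(-3.0, 3.0, 7)
print(log_sigmoid(x))       # rises smoothly from near 0 to near 1
print(tangent_sigmoid(x))   # rises smoothly from near -1 to near 1
```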


FIGURE 4.2 Schematic of feedforward neural network model (Agatonovic-Kustrin and Beresford, 2000).

network, there is only a single forward connection from a node to the nodes of the next layer, and thus there is no record of the previous output values (Agatonovic-Kustrin and Beresford, 2000). In the feedback network structure shown in Fig. 4.3, by contrast, the output of one layer is also connected back to the input of a previous layer; each step in such a network depends on the input signals and on the previous states of the network, and thus every step of the network keeps a memory of the previous steps.

4.2.2 Network training processes

After defining the structure of the neural network for a particular application, the network is trained with the observed data. The parameters that need to be defined before using the ANN model are the values of the weights and biases. The initial values of these parameters are chosen randomly, and then the values are fixed

FIGURE 4.3 Schematic of feedback neural network model (Agatonovic-Kustrin and Beresford, 2000).


through the model training process. There are two distinctive types of training methods employed in ANN modeling: (i) the unsupervised method and (ii) the supervised method (Lee et al., 2005). In the unsupervised training method, input vectors are the only data required for training of the model; no additional information on the output data needs to be embedded in the training. The unsupervised method of ANN model training was developed by Kohonen (1982), Kohonen and Somervuo (1998), and in other studies. In the unsupervised training method, the training algorithms modify the initially assigned random weights during the processing of the input vectors in such a manner that the ANN model simulates the target values as closely as possible to the observed values. Unsupervised training methods are commonly used for real-time problems under fast-changing situations where a prompt response is needed (Hussain et al., 2018; Lee et al., 2005; Agatonovic-Kustrin and Beresford, 2000). In these situations, access to the additional information needed for the training process poses a serious limitation, and hence the unsupervised method of model training is the most suitable choice. In contrast, supervised training methods use the more critical variables as input and target (output) vectors in the training process, and hence they are expected to produce more accurate simulation outputs than those yielded by unsupervised training methods (Lee et al., 2005). However, there are some concerns associated with the retrospective nature of supervised training. Supervised training methods are mostly based on the analysis of historical data, and the accuracy of the simulation outputs is significantly affected by the fast-changing nature of the environment. Thus, it is difficult to say that supervised training methods can provide correct real-time solutions, especially for future studies under a high probability of environmental change. Moreover, supervised methods rely on the input and target (output) vectors together to make model training possible. Because the target vector is mostly available only retrospectively, the availability of the target vectors is the main drawback of supervised training methods (Jiang et al., 2017; Lee et al., 2005). Theoretical descriptions of the unsupervised and supervised training algorithms widely used in ANN modeling are provided in the following subsections.

4.2.2.1 Unsupervised training method

Unsupervised training methods have been used in many fields such as classification, clustering, forecasting, and pattern recognition (Chen et al., 2018). They are mostly applied to model solutions to real-time problems in the presence of ongoing change in the environment. In an unsupervised training process, the input vector is the only necessary training data, without the need for an output


vector (Lee et al., 2005). The training algorithm for the unsupervised method was developed by Kohonen (1982), Kohonen and Somervuo (1998); it processes the input data by modifying the weights and biases and produces clusters or classes of output data (Somervuo and Kohonen, 1999). Kohonen's self-organizing map (SOM) is a widely used map-based unsupervised training method in which an input space of training samples is reduced in dimension by clustering similar samples, as shown in Fig. 4.4. In the training process, a neighborhood function is used to preserve the topological characteristics of the input data. The size of the map in the output layer depends on the number of input data and is commonly set to 5\sqrt{n}, where n is the number of samples. After the network size is fixed, the SOM method initializes the weight vectors for the training process. The initial weights are defined randomly, and the weight that best represents a chosen sample is identified on the basis of the Euclidean distance computed over all map units (Kim et al., 2015). The Kohonen layer can have a rectangular or a hexagonal grid; in Fig. 4.4, the black points indicate the neurons selected as the best match for the input data. The major steps followed in applying the SOM unsupervised training process are summarized below (Lee et al., 2005; Kim et al., 2015).

Step 1: Initialize the weight vectors (w_i) with random values.
Step 2: Present an input vector (x_j) to the network.
Step 3: Calculate the Euclidean distance to find the closest matching unit c_j for the input vector, Eq. (4.2):

c_j = \min_i \| w_i - x_j \|   (4.2)

FIGURE 4.4 Explaining dimensionality reduction in self-organized maps: input vectors X_j in the input layer are connected through weights to a rectangular or hexagonal Kohonen layer (Seif et al., 2014).


Step 4: Update the weight vectors of the best matching unit and its neighboring units iteratively, using a neighborhood function, so as to minimize the distance between unit i and the winning unit c_j, Eq. (4.3):

w_i^{new} = \frac{\sum_{j=1}^{n} h_{c_j,i}\, x_j}{\sum_{j=1}^{n} h_{c_j,i}}   (4.3)

where h_{c_j,i} is the neighborhood function around the winner c_j. The iterations of the SOM method are repeated until the weight values converge.
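As an informal illustration of these four steps (not code from the chapter), the following sketch trains a small one-dimensional SOM on random two-dimensional samples; the map size, learning-rate schedule, and Gaussian neighborhood width are hypothetical choices, and the common incremental update is used rather than the batch form of Eq. (4.3).

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((200, 2))            # n training samples (input vectors x_j)
n_units = 10                        # size of the 1-D Kohonen map (assumption)
W = rng.random((n_units, 2))        # Step 1: random initial weight vectors w_i
positions = np.arange(n_units)      # map coordinates used by the neighborhood function

for epoch in range(50):
    lr = 0.5 * (1 - epoch / 50)                         # shrinking learning rate (assumption)
    radius = max(1.0, n_units / 2 * (1 - epoch / 50))   # shrinking neighborhood radius
    for x in X:                                         # Step 2: present an input vector
        bmu = np.argmin(np.linalg.norm(W - x, axis=1))  # Step 3: best matching unit (Eq. 4.2)
        h = np.exp(-((positions - bmu) ** 2) / (2 * radius ** 2))  # neighborhood h_{c,i}
        W += lr * h[:, None] * (x - W)                  # Step 4: pull BMU and neighbors toward x

print(np.round(W, 2))   # trained weight vectors now cluster the input space
```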

4.2.2.2 Supervised training method

In ANN modeling, the backpropagation algorithm (BPA), based on the generalized delta rule, is widely used as a supervised training method to define the values of the weights and biases (Hüsken and Stagge, 2003; Chitsaz et al., 2016; Hussain et al., 2018). The major reason behind the popularity of the BPA in the supervised training of ANN models is its simplicity of understanding and implementation. The first step in applying this method is to guess initial values for the weights and biases. These initial values are then updated through an iterative process, which is continued until a threshold criterion such as the correlation coefficient or the root mean square error is satisfied. The BPA based on the delta rule is defined by Eqs. (4.4) and (4.5):

E = (y - y')^2 \;\Rightarrow\; \frac{\partial E}{\partial y_j^{(l)}} = -2\,(y_j - y'_j)   (4.4)

w_{ij}^{new(l)} = w_{ij}^{old(l)} + r\left(-\frac{\partial E}{\partial w_{ij}^{(l)}}\right)   (4.5)

where E is the average error of estimation, y and y' are the target and simulated output, respectively, w_{ij}^{(l)} is the weight linking the ith neuron to the jth neuron in the lth layer, and r is the learning rate, whose value varies between 0 and 1 (Govindaraju and Rao, 2000). The delta rule of the BPA method uses the mathematical chain rule of differentiation to calculate the gradient of the error function (Govindaraju and Rao, 2000). In each iteration, the network weights are adjusted along the negative of the gradient of the performance function, in the steepest descent direction. The chain rule for a specific weight in the lth hidden layer is written as follows (Govindaraju and Rao, 2000; Agatonovic-Kustrin and Beresford, 2000).

For the lth hidden layer, Eq. (4.6) to Eq. (4.8):

\frac{\partial E}{\partial w_{ij}^{(l)}} = \frac{\partial E}{\partial S_j^{(l)}} \cdot \frac{\partial S_j^{(l)}}{\partial w_{ij}^{(l)}}   (4.6)

If -\dfrac{\partial E}{\partial S_j^{(l)}} = \delta_j^{(l)}, then

\frac{\partial S_j^{(l)}}{\partial w_{ij}^{(l)}} = \frac{\partial}{\partial w_{ij}^{(l)}} \left( \sum_k w_{kj}^{(l)} y_k^{(l-1)} \right) = y_i^{(l-1)}   (4.7)

w_{ij}^{new(l)} = w_{ij}^{old(l)} + r\left(\delta_j^{(l)}\, y_i^{(l-1)}\right)   (4.8)

For the last layer, Eq. (4.9):

\delta_j^{(l)} = -\frac{\partial E}{\partial S_j^{(l)}} = -\frac{\partial E}{\partial y_j^{(l)}} \cdot \frac{\partial y_j^{(l)}}{\partial S_j^{(l)}} = 2\,(y_j - y'_j)\, f'_j\!\left(S_j^{(l)}\right)   (4.9)

For the hidden layers, Eq. (4.10):

\delta_j^{(l)} = -\sum_k \frac{\partial E}{\partial S_k^{(l+1)}} \cdot \frac{\partial S_k^{(l+1)}}{\partial y_j^{(l)}} \cdot f'_j\!\left(S_j^{(l)}\right) = \left(\sum_k \delta_k^{(l+1)} w_{jk}^{(l+1)}\right) f'_j\!\left(S_j^{(l)}\right)   (4.10)

The main steps followed in applying the supervised BPA method for training of the ANN models are explained below.

Step 1: Guess initial values for the biases and weights, w_{ij}^{old(l)}.
Step 2: Calculate the output of the first layer by Eq. (4.11) and of the hidden layers by Eq. (4.12):

y_j = f_j\left[\sum_i w_{ij} x_i\right]   (4.11)

y_j^{(l)} = f_j\left[\sum_i w_{ij}^{(l)} y_i^{(l-1)}\right]   (4.12)

Step 3: Find the error for the last layer by Eq. (4.13), and find the error for the hidden layers by backpropagating the error of the last layer, Eq. (4.14):

\delta_j = \left(y_j - y'_j\right) f'_j\!\left(S_j^{(l)}\right)   (4.13)

\delta_j^{(l)} = \left(\sum_k \delta_k^{(l+1)} w_{jk}^{(l+1)}\right) f'_j\!\left(S_j^{(l)}\right)   (4.14)

Step 4: Finalize the weight values for the last layer by Eq. (4.15) and for the hidden layers by Eq. (4.16):

w_{ij}^{new} = w_{ij}^{old} + r\left(\delta_j\, y_i^{(l-1)}\right)   (4.15)

w_{ij}^{new(l)} = w_{ij}^{old(l)} + r\left(\delta_j^{(l)}\, y_i^{(l-1)}\right)   (4.16)
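To make the delta-rule updates concrete, here is a minimal numpy sketch (not from the chapter) that trains a one-hidden-layer network with sigmoid transfer functions on a toy dataset; the layer sizes, learning rate, and iteration count are arbitrary assumptions.

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

rng = np.random.default_rng(1)
X = rng.random((100, 3))                                   # toy inputs
y = (X.sum(axis=1, keepdims=True) > 1.5).astype(float)     # toy target

W1 = rng.normal(scale=0.5, size=(3, 5))    # Step 1: random initial weights and biases
b1 = np.zeros(5)
W2 = rng.normal(scale=0.5, size=(5, 1))
b2 = np.zeros(1)
r = 0.1                                    # learning rate, 0 < r < 1

for it in range(2000):
    # Step 2: forward pass through the hidden and output layers
    H = sigmoid(X @ W1 + b1)
    y_hat = sigmoid(H @ W2 + b2)

    # Step 3: error terms (deltas) for the output layer and, by backpropagation, the hidden layer
    delta2 = (y - y_hat) * y_hat * (1 - y_hat)     # (y - y') * f'(S), as in Eq. (4.13)
    delta1 = (delta2 @ W2.T) * H * (1 - H)         # sum_k delta_k w_jk * f'(S), as in Eq. (4.14)

    # Step 4: weight and bias updates in the steepest descent direction, Eqs. (4.15)-(4.16)
    W2 += r * H.T @ delta2 / len(X)
    b2 += r * delta2.mean(axis=0)
    W1 += r * X.T @ delta1 / len(X)
    b1 += r * delta1.mean(axis=0)

print(float(np.mean((y - y_hat) ** 2)))   # training error decreases as the iterations proceed
```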

The above steps are repeated until the criterion to stop the training is satisfied. The threshold criteria for stopping the training process are minimization of the simulation (test) error and of the model complexity, which is related to the runtime of the process and the number of iterations of the BPA. The stopping point is therefore related to the complexity of the model and is governed by underfitting (high bias) and overfitting (high variance), the two main problems in ANN simulation models (Chan et al., 2006). Both underfitting and overfitting tend to reduce the performance of ANN models. As model complexity increases, bias is reduced and variance is increased (Trevor, 2009). Low complexity increases the bias, which results in poor generalization and underfitting of the model. On the other hand, an increase in model complexity causes the model to fit the training data more tightly, and the training error tends to decrease; high complexity increases overfitting, and the fitted model remains too close to the training data. Overfitting therefore also results in poor generalization and increases the testing error. Typically, the model complexity level should be chosen by considering the bias-variance trade-off so as to minimize the testing error. A typical behavior of the testing and training errors as a function of model complexity is shown in Fig. 4.5: harder fitting of the training data tends to decrease the training error and increase the testing error, and vice versa (Trevor, 2009).

FIGURE 4.5 Bias-variance trade-off (Trevor, 2009): prediction error of the training sample and the test sample as a function of model complexity, ranging from high bias/low variance (low complexity) to low bias/high variance (high complexity).


4.2.3 Artificial neural network to approximate a function

Function approximation is the process of exploring the underlying relationship between input and output data, and it is a fundamental problem in many applications in mathematics and engineering concerning prediction, data mining, pattern recognition, classification, and clustering (Mhaskar, 1993). Function approximation selects, from a set of well-defined classes, the function that best matches the target function. It is one of the most common applications of ANN modeling: ANN models are able to approximate any continuous real-valued function to a desired accuracy by combining a sufficient number of activation functions (Araghinejad, 2014). In mathematical terms, the approximation is defined by Eqs. (4.17) and (4.18) (Mhaskar, 1993):

(A_n f)(x_1, \ldots, x_m) = \sum_{i=1}^{n} w_i\, \varphi\!\left(\sum_{j=1}^{m} a_{ij} x_j + b_i\right)   (4.17)

\|f - A_n f\|_p \leq \varepsilon   (4.18)

where \varphi(\cdot) is an arbitrary activation function, n and m are the numbers of hidden neurons and inputs, respectively, w_i and b_i are the weight and bias of the ith hidden neuron, a_{ij} is the weight applied to the jth input x_j, and \varepsilon is the accuracy of the approximation, which depends on how closeness between functions is measured for the specific problem at hand. The application of the approximation function proceeds in three steps: (i) using a preprocessing technique to standardize and normalize the input data, (ii) choosing the best simulation (ANN) model by selecting a suitable network architecture and network training method, and (iii) using a postprocessing technique for validation of the chosen ANN model.

4.2.3.1 Step 1: preprocessing of data The problems encountered in gathered data such as presence of outliers/extremes, unrealistic and erroneous data combinations, and gaps in data may cause the uncertainty and inappropriateness in the results obtained from the ANN modeling. These issues associated with data collection may be easily handled by employing data preprocessing techniques (Chitsaz et al., 2016). The data preprocessing techniques are used to improve the neural network training process, stability, and modeling performance after defining the input data as predictors (Kotsiantis et al., 2006). The main purpose of applying the preprocessing techniques is modification and transformation of the input data (predictor variables) in such a way to better match the simulation results with the target output. Choosing a suitable preprocessing technique increases the results’ accuracy and decreases the computational cost of the learning process.


The preprocessing techniques improve the ANN modeling results by modifying and reducing the training set size, by removing noise and outliers from the input data, and by correcting possible errors stemming from unusual training samples (Nawi et al., 2013). Normalization and principal component analysis (PCA) are the two most commonly used data preprocessing techniques in ANN modeling. The widely used normalization techniques are minimum-maximum, z-score, and decimal scaling.

4.2.3.1.1 Data normalization techniques

- Minimum-maximum normalization uses linear interpolation to rescale the input data, typically to the range [0, 1] or [-1, 1], as in Eq. (4.19) (Kotsiantis et al., 2006; Nawi et al., 2013):

x' = \frac{x - \min_x}{\max_x - \min_x}   (4.19)

where x is the value of the input data and x' is the normalized value of x.

- Z-score normalization, also known as zero-mean normalization, uses the mean and standard deviation of each set of input data to generate a set of normalized data. The transformation formula is expressed as Eq. (4.20) (Nawi et al., 2013; Kotsiantis et al., 2006):

x' = \frac{x - \mu_x}{\sigma_x}   (4.20)

where \mu_x and \sigma_x are the mean and standard deviation of the input dataset x. The new normalized dataset has zero mean and unit variance.

- Decimal scaling normalization moves the decimal point of the data values, as in Eq. (4.21) (Nawi et al., 2013):

x' = \frac{x}{10^m}   (4.21)

where x' is the normalized value of x and m is the smallest integer such that \max |x'| < 1. A short code sketch of these three rescaling rules is given after this list.
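The sketch below (a hypothetical example, not from the chapter) applies the three normalization rules of Eqs. (4.19)-(4.21) to a small set of raw values.

```python
import numpy as np

x = np.array([12.0, 48.5, 7.2, 95.0, 33.3])   # hypothetical raw input data

# Minimum-maximum normalization, Eq. (4.19): rescales values into [0, 1]
x_minmax = (x - x.min()) / (x.max() - x.min())

# Z-score normalization, Eq. (4.20): zero mean and unit variance
x_zscore = (x - x.mean()) / x.std()

# Decimal scaling normalization, Eq. (4.21): divide by 10**m so that max |x'| < 1
m = int(np.floor(np.log10(np.abs(x).max()))) + 1
x_decimal = x / 10 ** m

print(x_minmax, x_zscore, x_decimal, sep="\n")
```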

4.2.3.1.2 Principal component analysis

The PCA technique reduces the dimension of the input datasets by grouping them into principal components in order to simplify the analysis (Hotelling, 1933). In PCA, the characteristics of the input data that contribute most to the data variance are highlighted, while components that do not reflect most of the variance of the system are ignored. PCA is based on transforming the coordinate system: for a two-dimensional dataset, the


coordinate system changes in such a way that the x and y axes are rotated and the data are plotted on the new axes x' and y'. The first new axis, x', has the same direction as the maximum variation in the data, and the second axis, y', lies in the direction of the next-most variation. The new axes are set by the eigenvalues and eigenvectors of the covariance matrix of the data; the first axis, with the highest variation, has the highest eigenvalue (Chitsaz and Hosseini-Moghari, 2018). The new dataset is generated by taking weighted averages of the original dataset, as shown in Eqs. (4.22) and (4.23) (Wu et al., 2010):

Y = W' \cdot X   (4.22)

y_{ij} = w_{1i} x_{1j} + \ldots + w_{pi} x_{pj}   (4.23)

where Y is the new dataset, called the principal components, X is the original dataset with n observations of p variables, and W is the matrix of coefficients defined by the PCA through Eq. (4.24) to Eq. (4.26) (Wu et al., 2010):

W = U \cdot V^{\frac{1}{2}}   (4.24)

Z \cdot U = U \cdot V   (4.25)

Z_{ij} = \frac{\sum_{m=1}^{n} (x_{im} - \bar{x}_i)(x_{jm} - \bar{x}_j)}{n - 1}   (4.26)

where Z is the variance-covariance matrix of the input data x, U is the matrix of eigenvectors of Z, and V is the diagonal matrix of eigenvalues of Z.
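A minimal sketch of these PCA steps follows (not from the chapter); the dataset is random, and the projection uses the eigenvectors directly rather than the scaled coefficient matrix of Eq. (4.24).

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(50, 3))       # hypothetical dataset: n = 50 observations, p = 3 variables

# Eq. (4.26): variance-covariance matrix of the mean-centered data
Xc = X - X.mean(axis=0)
Z = Xc.T @ Xc / (len(X) - 1)       # equivalent to np.cov(X, rowvar=False)

# Eq. (4.25): eigendecomposition of Z gives eigenvectors U and eigenvalues V
eigvals, U = np.linalg.eigh(Z)
order = np.argsort(eigvals)[::-1]  # sort components by decreasing explained variance
eigvals, U = eigvals[order], U[:, order]

# Eqs. (4.22)-(4.23): project the data onto the eigenvector directions (principal components)
Y = Xc @ U

print(eigvals)    # variance explained by each principal component
print(Y[:3])      # first three observations in the new coordinate system
```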

4.2.3.2 Step 2: choosing the best network architecture

In designing the ANN architecture, the parameters to be considered include the number of hidden layers, the number of neurons in each layer (input, hidden, and output), the specific transfer (activation) functions, the connection patterns between layers, and the learning methods (Hussain et al., 2018). The architecture of the ANN model determines how the network transforms its input into an output, and different interconnection patterns of neurons determine the ANN functionality. The most common networks in water resources and environmental engineering are the multilayer perceptron (MLP), tapped delay line (TDL), radial basis function (RBF), generalized regression neural network (GRNN), and probabilistic neural networks (Firat et al., 2009; Kişi, 2010; Ghorbani et al., 2016; Chitsaz et al., 2016). These models are described in more detail in the next sections. Network training is an important step in ANN modeling, used as a mapping process to calibrate the chosen neural network. Each ANN architecture may be affected by underfitting and overfitting in the training process, which should be alleviated by the calibration method. The best design of the network architecture adjusts the complexity of the ANN with consideration of


trade-off between bias and variance and decreases the adverse effects of overfitting and underfitting in model training step (Trevor, 2009). The optimum number of neurons and hidden layers is determined by trial and error process comparing the error in the training and testing processes of the ANN model.

4.2.3.3 Step 3: postprocessing of data

Postprocessing techniques are used for validation of the chosen ANN model (Chitsaz and Hosseini-Moghari, 2018; Belotti et al., 2018). The accuracy of the ANN models is assessed through performance evaluation in the training and testing stages, employing statistical goodness-of-fit metrics such as the linear correlation coefficient (R), root mean square error, volume error, Nash-Sutcliffe efficiency, and discrepancy ratio (Nash and Sutcliffe, 1970; Chitsaz et al., 2016). Mathematical expressions of these statistical metrics are summarized in Table 4.2. If the performance evaluation does not find the model simulations satisfactory, the network architecture needs to be modified through an iterative process until the estimation error between simulated and observed data decreases to an acceptable level (Araghinejad, 2014).

TABLE 4.2 Postprocessing techniques and their mathematical expressions (Chitsaz and Hosseini-Moghari, 2018).

Postprocessing technique | Mathematical expression
Linear correlation coefficient (R) | R = \dfrac{\sum_{i=1}^{n} (y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})}{\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}\, \sqrt{\sum_{i=1}^{n} (\hat{y}_i - \bar{\hat{y}})^2}}
Root mean square error (RMSE) | RMSE = \sqrt{\dfrac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n}}
Volume error (VE) | VE = \dfrac{1}{n} \sum_{i=1}^{n} \dfrac{y_i - \hat{y}_i}{y_i}
Nash-Sutcliffe efficiency (NSE) | NSE = 1 - \dfrac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
Discrepancy ratio (Dr) | Dr = \dfrac{1}{n} \sum_{i=1}^{n} \dfrac{y_i}{\hat{y}_i}

Note: n is the number of observed values; y_i, \hat{y}_i, \bar{y}, and \bar{\hat{y}} are the ith observed value, the ith simulated value, the mean of the observed values, and the mean of the simulated values, respectively.
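A small, hedged sketch (not from the chapter) computing the Table 4.2 metrics for a pair of hypothetical observed and simulated streamflow series:

```python
import numpy as np

def evaluation_metrics(y, y_hat):
    """Goodness-of-fit metrics of Table 4.2 for observed (y) and simulated (y_hat) series."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    r = np.corrcoef(y, y_hat)[0, 1]                        # linear correlation coefficient R
    rmse = np.sqrt(np.mean((y - y_hat) ** 2))              # root mean square error
    ve = np.mean((y - y_hat) / y)                          # volume error (relative, per observation)
    nse = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)   # Nash-Sutcliffe efficiency
    dr = np.mean(y / y_hat)                                # discrepancy ratio
    return {"R": r, "RMSE": rmse, "VE": ve, "NSE": nse, "Dr": dr}

# Hypothetical observed and simulated streamflow values (m3/s)
obs = [120.0, 95.0, 180.0, 60.0, 140.0]
sim = [115.0, 100.0, 170.0, 70.0, 150.0]
print(evaluation_metrics(obs, sim))
```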


The iterative procedure involves changing the number of hidden layers, the number of neurons in each layer, the transfer functions, and the initial guesses for the biases and weights.

4.3 Types of artificial neural networks

4.3.1 Multilayer perceptron neural network

The MLP neural network model uses the supervised BPA for training (Lee and Kang, 2016; Ghorbani et al., 2016) and is one of the most common types of feedforward neural networks. In this ANN model, the neurons in the input layer act as buffers that distribute the input signals to the neurons in the hidden layers. Each neuron in a hidden layer forms the sum of its weighted input signals, as in Eqs. (4.27) and (4.28) (Araghinejad, 2014):

S_j = \sum_{i=1}^{n} w_{ij} x_i + b_j   (4.27)

where S_j is the weighted input of the jth hidden neuron, n is the number of independent variables, w_{ij} is the weight between the ith neuron and the jth neuron, x_i is the value of the ith input, and b_j is the bias of the jth neuron.

\hat{y} = f\left[\sum_{j=1}^{m} w_{jk} \cdot g\left(\sum_{i=1}^{n} w_{ij} x_i + b_j\right) + b_k\right]   (4.28)

where \hat{y} is the simulated output, m is the number of neurons in the hidden layer, b_k is the bias of the kth neuron, w_{jk} is the weight between the jth neuron and the kth neuron, and f and g are transfer functions chosen from those given in Table 4.1. In addition, the training process should minimize the error function E of Eq. (4.29) so that accurate output is produced from the input data (McNelis, 2005):

E = \frac{\sum_{k=1}^{K} (y_k - \hat{y}_k)^2}{K}   (4.29)

where K is the number of input or output data and y_k and \hat{y}_k are the kth observed and estimated data, respectively.

4.3.2 Static and dynamic neural network

The ANN models are classified into two groups, static and dynamic, based on the time dependency of the datasets (Tsoi and Back, 1995; Chiang et al., 2004). Most engineering applications involve static neural networks, in which the input and output data are interconnected directly through a feedforward type of network connection without any delays.


The static input-output structure is suitable for applications where the input and output vectors have spatial patterns without any temporal variability. However, static networks have some limitations: (i) in a static feedforward neural network, the neurons transfer the information from the input to the output layers in the forward direction only, and the information never comes back to the previous layers; and (ii) in many time-dependent tasks the pattern of input data includes temporal signals, which the static neural network cannot consider because of its inability to account for the dynamic nature of the application (Hussain et al., 2018; Sinha, 2000). Unlike the static neural network, in dynamic neural network (DNN) structures the connection between inputs and outputs is based on both spatial and temporal variability, with extensive feedback among the neurons and layers of the network (Chiang et al., 2007). The output of a DNN depends on both the current and the previous status of the inputs and network states; this capability represents the local memory characteristic of DNNs. The DNN is of two types: (i) TDLs and (ii) recurrent neural networks (RNNs) (Chiang et al., 2007; Araghinejad, 2014). The TDL contains several time-delay procedures, which keep the time series information in a memory box, as shown in Fig. 4.6. The signals of each input at different time steps are recorded in the memory and passed to the output layer. The MLP network can be improved into a time delay neural network when it is combined with the TDL neural network (Wan, 1994). The RNN not only considers temporal processing, similar to the TDL network, but also represents the time dimension and its effects on the training process (Araghinejad, 2014; Belotti et al., 2018). The recursive connections between layers transfer the information from the output layer back to the hidden layer and then back to the input layer in each time step (Fig. 4.7). The RNN can represent both linear and complex nonlinear relations in

FIGURE 4.6 Structure of tapped delay line neural network model.


FIGURE 4.7 The structure of a recurrent neural network model: an input layer of M nodes (x), a processing layer of N nodes (z) with weight matrix W and a time-delay feedback loop, and an output layer of K nodes (y) with weight matrix V (Chiang et al., 2007).

the input and output datasets in prediction and classification types of dynamic applications. The RNN has memory for storing complex time series information, including extensive dynamic behavior. One of the most important applications of the RNN model is in capturing temporal dynamic patterns, which makes these networks applicable to both linear and nonlinear decision boundaries, to static and dynamic problems, and to temporal and spatial modeling. This kind of network is beneficial in time-dependent applications such as streamflow analysis and modeling (Hüsken and Stagge, 2003; Chung et al., 2009).
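As a small illustration of the tapped-delay idea (not code from the chapter), the following sketch builds a lagged input matrix from a streamflow series so that a network can be fed Q(t-1), Q(t-2), ... to predict Q(t); the series values and the number of delays are hypothetical.

```python
import numpy as np

# Hypothetical daily streamflow series Q(t) (m3/s)
q = np.array([32.0, 35.1, 40.2, 55.0, 48.3, 44.9, 42.0, 39.5, 38.2, 37.0])
n_delays = 3   # length of the tapped delay line (assumption)

# Each row holds the delayed inputs [Q(t-1), Q(t-2), Q(t-3)]; the target is Q(t)
X = np.column_stack([q[n_delays - k - 1 : len(q) - k - 1] for k in range(n_delays)])
y = q[n_delays:]

print(X)   # lagged predictor matrix fed to the network
print(y)   # corresponding one-step-ahead targets
```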

4.3.3 Statistical neural networks

The RBF network is a type of statistical neural network that has extensive application as a nonlinear modeling tool in classification, simulation, time series prediction, and function approximation (Ghorbani et al., 2016; Ateeq Ur et al., 2018). The RBF neural network has some merits, such as its on-line model training capability, which is necessary in real-time signal processing, and its simple structure, which is easily developed and trained in relatively little time. A standard form of the RBF network is based on the feedforward neural network architecture and consists of three layers, namely the input layer, the hidden layer (containing a number of RBF nonlinear activation units), and the linear output layer, as shown in Fig. 4.8. There is only one hidden layer in the RBF neural network, where the number of neurons is related to the number of inputs and prediction outputs. Each neuron in the hidden layer transfers the information of the predictor or observed data. The output of the hidden nodes is produced from the Euclidean distance between the input and the center of the basis function. The basis function is a symmetric function with a unique maximum at its center,


FIGURE 4.8 The structure of radial basis function neural network model.

which drops off to zero away from the center. The Gaussian function is the most common activation function used in the RBF network, as in Eqs. (4.30) and (4.31) (Govindaraju and Rao, 2000; Ghorbani et al., 2016):

y_i = \sum_{j=1}^{k} w_{ij}\, e^{-l_j^2}   (4.30)

l_j = \frac{\| x - \mu_j \|}{\sigma_j}   (4.31)

where x is the input vector, \mu_j is the center of the basis function, \sigma_j is the radius width of the jth hidden node, and k is the number of hidden nodes. The RBF neural network has been used extensively in nonlinear system modeling for training and classification because of the simplicity and fast-training characteristics of the network. In the RBF method, input data produce significant activation only in a small region (because of the Euclidean distance between the input and the center), and consequently the chance of getting stuck at a local minimum is small.
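A minimal sketch of the RBF forward pass of Eqs. (4.30)-(4.31), with hypothetical centers, widths, and output weights (not from the chapter):

```python
import numpy as np

def rbf_predict(x, centers, sigmas, W):
    """Forward pass of an RBF network: Gaussian hidden units followed by a linear output layer."""
    # Eq. (4.31): scaled Euclidean distance between the input vector and each basis-function center
    l = np.linalg.norm(x - centers, axis=1) / sigmas
    # Eq. (4.30): weighted sum of the Gaussian activations of the k hidden nodes
    phi = np.exp(-l ** 2)
    return W @ phi

# Hypothetical network: 2 inputs, k = 3 hidden nodes, 1 output
centers = np.array([[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]])   # basis-function centers mu_j
sigmas = np.array([0.3, 0.3, 0.3])                          # radius widths sigma_j
W = np.array([[0.8, -0.2, 0.5]])                            # output-layer weights w_ij

print(rbf_predict(np.array([0.4, 0.6]), centers, sigmas, W))
```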


In contrast, in the BPA-trained MLP method there is always the possibility of getting stuck in a local minimum (Markopoulos et al., 2016; Wu et al., 2012). The GRNN model is a modified version of the RBF neural network model, as suggested by Specht (1991). It is considered a normalized RBF model with a specific center at each training case. The GRNN model is based on nonlinear regression theory with feedforward and backpropagation training processes, and it approximates the regression function between the input value x and the output value y directly from the training data (Kişi, 2010). The GRNN model has the ability to approximate any smooth function under the assumption of nonlinearity. The GRNN algorithm is presented in Eq. (4.32) to Eq. (4.34) (Kişi, 2010; Chitsaz et al., 2016):

y_j = \frac{\sum_{i=1}^{n} w_{ij} h_i}{\sum_{i=1}^{n} h_i}   (4.32)

h_i = \exp\!\left(-\frac{D_i^2}{2\sigma^2}\right)   (4.33)

D_i = \sqrt{(x - x'_i)^{T} (x - x'_i)}   (4.34)

where y is the output of the simulation, x is the input estimator, w_{ij} are the weights of the output layer, which are taken as the target values, x'_i is the ith training vector of x, and \sigma is the constant that controls the size of the perceptive region. The smoothing factor \sigma alters the degree of generalization of the network; its range is [0, 1], where a value of 0 produces a dot-to-dot map and a value of 1 produces a straight prediction line. A higher smoothing factor increases the generalization ability of the network, and its value is therefore optimized for the model inputs. The structure of the GRNN model is shown in Fig. 4.9.
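A compact sketch of the GRNN prediction rule of Eqs. (4.32)-(4.34); the training pairs and smoothing factor are hypothetical, and the scalar targets are used as the output-layer weights, as the text describes.

```python
import numpy as np

def grnn_predict(x, X_train, y_train, sigma=0.2):
    """GRNN estimate for input x: distance-weighted average of the training targets."""
    D = np.linalg.norm(X_train - x, axis=1)      # Eq. (4.34): Euclidean distances D_i
    h = np.exp(-D ** 2 / (2 * sigma ** 2))       # Eq. (4.33): Gaussian kernel weights h_i
    return np.sum(y_train * h) / np.sum(h)       # Eq. (4.32): normalized weighted sum

# Hypothetical training cases (inputs and target values used as output weights)
X_train = np.array([[0.1, 0.2], [0.4, 0.4], [0.8, 0.9], [0.6, 0.3]])
y_train = np.array([10.0, 14.0, 25.0, 17.0])

print(grnn_predict(np.array([0.5, 0.5]), X_train, y_train))
```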

4.4 An overview of application of artificial neural network modeling in streamflow forecasting

Streamflow is one of the most important components of the water cycle, and its forecasting is of utmost importance for the planning and management of water resources at watershed, catchment, basin, regional, and continental spatial scales (Ateeq Ur et al., 2018; Yaseen et al., 2015; Maier et al., 2010). Streamflow directly affects stream ecosystems, and its accurate prediction with consideration of spatiotemporal variability is important for environmental and water resources management (Banihabib et al., 2020; Chitsaz and Banihabib, 2015).


FIGURE 4.9 The structure of generalized regression neural network (Araghinejad, 2014).

The characteristic patterns of streamflow are complex and not easy to predict because of the nonlinearity and nonstationarity characteristics of streamflow (Tyralis et al., 2019; Yaseen et al., 2015). Traditional hydrological models have the physical and mathematical-based representation of various watershed processes, and thus, they are essentially linear models, which are appropriate for the stationary data. As most of the developed hydrological models are based on the assumption that datasets are linear and are incapable in handling and processing of nonstationary and nonlinear hydrologic datasets, these models are not appropriate for solving nonlinear problems (Ravansalar et al., 2017). In addition to constraints of not dealing with the nonlinearity and nonstationary behavior in the data distributions of stochastic time series, hydrological models need extensive data of various hydrologic processes including rainfall, streamflow, and watershed characteristics such as land use/ land cover, soil types, hydrologic soil condition, etc. Necessity of calibration and validation of the hydrological models is another drawback of the hydrological models (Ateeq Ur et al., 2018). On the other hand, the data-driven AI models such as ANN models have the capability of minimizing the necessity of collecting extensive hydrological data including watershed processes and characteristics. Although ANN models do not need a stationary and normally distributed data, they have the capability to capture the nonstationary features in natural global phenomena into their inner structure and transfer their effects into the outputs. ANN models have the ability to generalize the inputeoutput relationships with consideration of any noise in the datasets (Dolling and Varas, 2002). Thus, the ANN models are applied as an alternative to the physically based hydrological models. Over the last few decades, the ANN


models have been increasingly used in the field of hydrology and water resources management in the context of rainfall-runoff modeling, streamflow forecasting, water quality assessment, and groundwater modeling (Poul et al., 2019; Wu et al., 2005; Zhu et al., 2020). In recent hydrological studies, the ANN models have been increasingly employed by the researchers worldwide for obtaining the improved forecasting results by precisely simulating the nonlinear hydrological processes with or without involving a complex algorithm and solutions (Allawi et al., 2019; Shirgure, 2013; Toth and Brath, 2007; Dariane and Azimi, 2018). It is revealed from the literature that the ANN models are efficient in accurate simulation of mostly two hydrological phenomena, i.e., rainfall and streamflow (Shoaib et al., 2016; Hu et al., 2005; Mutlu et al., 2008; Kagoda et al., 2010). In some researches, temperature and evaporation have been considered as effective parameters in streamflow forecasting (Noori et al., 2011; Li et al., 2010; Ateeq Ur et al., 2018). In addition, in few studies, climate indices such as Southern Oscillation Index (SOI), North Atlantic Oscillation (NAO), Pacific Decadal Oscillation (PDO), and Atlantic Multi-decadal Oscillation Index (AMO) have been applied to improve the streamflow forecasting processes by ANN models (Chitsaz et al., 2016; Carrier et al., 2013). The ANN models provide accurate estimates of streamflow forecasts due to their capability in (i) representing the nonlinear functions for the complex trained networks, (ii) finding the relationships between different datasets for data classification and clustering, and (iii) generalizing the defined relationships in small size dataset with the existence of the noise and missing data in input dataset (Toth and Brath, 2007; Javan et al., 2015). The streamflow forecasting processes may be classified into two groups based on temporal scales. First category is called short-term forecasting processes at hourly and daily time scales, which is essentially for the real-time operations in water resources systems, flood warning, and mitigation systems. Second category is known as long-term forecasting processes related to weekly, monthly, and annual time scales, which is mainly used for reservoir planning and operations, synthesizing reservoir inflow series, assessing sediment transportation, and planning of hydropower and irrigation systems. Details of the salient studies where the ANN models are used in streamflow forecasting at short-term and long-term time scales are summarized in Table 4.3. It is apparent from the literature that the most common ANN models used in streamflow forecasting are MLP, DNNs such as TDLs and RNN, and statistical neural networks such as RBF and GRNN. It is seen that the MLP, RBF, and GRNN kind of ANN models are used for forecasting streamflow at different time scales in the past studies. The major drawbacks and limitations in applying the ANN models are the slow learning speed, trapped into the local minima, learning rate, stopping criteria, and overfitting problems (Chitsaz et al., 2016; Hussain et al., 2018). Of the several advantages of the ANN models, the MLP neural network has the ability to approximate complex

TABLE 4.3 Salient details of literature studies using artificial neural network (ANN) models in streamflow forecasting at long-term and short-term time scales.

Time scale | Source | Case study | Type of ANN model | Input parameters | Evaluation criteria
Yearly | Carrier et al. (2013) | Western US | Linear function kernel | Streamflow, climate oscillation indices (El Nino-Southern Oscillation, North Atlantic Oscillation [NAO], Pacific Decadal Oscillation [PDO], Atlantic Multidecadal Oscillation Index [AMO]) | Root mean square error-observations standard deviation ratio, Pearson's correlation coefficient, Nash-Sutcliffe coefficient of efficiency
Monthly | Zhang et al. (2016) | China | Radial basis function | Streamflow | Root mean square error, mean absolute error
 | Jain and Kumar (2007) | USA | Multilayer perceptron | Streamflow | Absolute relative error, correlation coefficient
 | Ahmed and Sarma (2007) | India | Multilayer perceptron | Streamflow | Mean square error, mean relative error
 | Li et al. (2010) | Taiwan | Radial basis function | Streamflow, global sea surface temperature, sea level pressure, outgoing long-wave radiation | Correlation coefficient
 | Noori et al. (2011) | Iran | Radial basis function | Rainfall, discharge, sun radiation, air temperature | Correlation coefficient, mean absolute error, root mean square error
 | Sudheer et al. (2013) | India | Radial basis function | Streamflow | Normalized mean squared error, correlation coefficient
 | Danandeh Mehr et al. (2015) | Turkey | Generalized regression neural network, radial basis function | Streamflow | Nash-Sutcliffe coefficient, root mean square error, correlation coefficient
 | Terzi and Ergin (2014) | Turkey | Radial basis function | Streamflow | Root mean square error, correlation coefficient
 | Awchi (2014) | Iraq | Generalized regression neural network, radial basis function | Streamflow | Correlation coefficient, mean square error, relative error
 | Chitsaz et al. (2016) | Iran | Multilayer perceptron, radial basis function, generalized regression neural network | Streamflow, rainfall, climate indices (Southern Oscillation Index, NAO, PDO, AMO Index) | Root mean square error, Nash-Sutcliffe efficiency, volume error, linear correlation
 | Ghorbani et al. (2016) | Iran | Multilayer perceptron, radial basis function | Streamflow | Correlation coefficient, root mean square error
 | Adnan et al. (2017) | China | Multilayer perceptron | Streamflow | Correlation coefficient, root mean square error, mean absolute error
 | Ateeq Ur et al. (2018) | Pakistan | Radial basis function | Precipitation, air temperature, streamflow | Root mean square error, mean bias error, correlation coefficient
Daily | Elganiny and Eldwer (2018) | Egypt | Multilayer perceptron | Streamflow | Mean absolute error, root mean square error
 | Fathian et al. (2019) | Canada | Radial basis function, generalized regression neural network | Streamflow | Root mean square error, mean absolute error, correlation coefficient, Nash-Sutcliffe model efficiency coefficient
 | Honorato et al. (2019) | Brazil | Multilayer perceptron | Streamflow | Root mean square error, mean absolute relative error, coefficient of efficiency
 | Belotti et al. (2018) | Brazil | Recurrent neural network | Streamflow | Mean square error, mean absolute error
 | Hsu et al. (1998) | Canada | Tapped delay lines, recurrent neural network | Precipitation, temperature, streamflow | Root mean square error
 | Dibike et al. (2001) | China, Vietnam, Nepal | Radial basis function | Catchment area, rainfall, evaporation, streamflow | Root mean square error
 | Cigizoglu (2005) | Turkey | Generalized regression neural network | Streamflow | Mean square error, correlation coefficient
 | Hu et al. (2005) | China | Generalized regression neural network, radial basis function | Streamflow, precipitation | Sum of squared errors, mean squared error
 | Sivapragasam and Liong (2005) | Denmark | Radial basis function | Streamflow | Root mean square error, coefficient of determination, correlation coefficient, mean absolute error
 | Mutlu et al. (2008) | USA | Multilayer perceptron, radial basis function | Streamflow, precipitation | Mean square error, correlation coefficient, Nash-Sutcliffe coefficient of model efficiency
 | Behzad et al. (2009) | Iran | Radial basis function | Streamflow | Root mean square error, correlation coefficient
 | Kagoda et al. (2010) | South Africa | Radial basis function | Streamflow, precipitation | Nash-Sutcliffe efficiency, root mean square error, bias between simulated and observed data
 | Yonaba et al. (2010) | US, Canada, and France | Multilayer perceptron | Streamflow | Mean absolute error, root mean square error, Nash-Sutcliffe efficiency
 | He et al. (2014) | China | Radial basis function | Streamflow | Root mean square error, Nash-Sutcliffe efficiency coefficient, mean absolute relative error
 | Yaseen et al. (2016) | Malaysia | Radial basis function | Streamflow | Mean absolute error, root mean square error, mean square error, correlation coefficient, relative error
Hourly | Sahoo et al. (2019) | India | Recurrent neural network, radial basis function | Streamflow, rainfall | Root mean square error, correlation coefficient
 | Banihabib et al. (2019) | Iran | Recurrent neural network | Streamflow | Root mean square error, mean bias error
 | Apaydin et al. (2020) | Turkey | Generalized regression neural network, recurrent neural network | Streamflow, rainfall | Correlation coefficient, Nash-Sutcliffe coefficient, root mean square error, mean absolute error
 | Hashimi and Dalkiliç (2020) | Greece | Multilayer perceptron | Precipitation, air temperature, humidity, streamflow | Mean square error, mean absolute error, correlation coefficient
 | Wu et al. (2005) | USA | Multilayer perceptron | Streamflow, precipitation | Correlation coefficient, mean squared error, mean relative error
 | Toth and Brath (2007) | Italy | Multilayer perceptron | Streamflow, precipitation | Efficiency coefficient, mean absolute error
 | Hu et al. (2020) | China | Multilayer perceptron, recurrent neural network | Streamflow, precipitation | Root mean square error, mean absolute error, correlation coefficient


142 Advances in Streamflow Forecasting

nonlinear functions accurately, whereas the RBF and GRNN kinds of statistical neural network models have the advantages of higher reliability, faster convergence, and lower extrapolation errors (Kişi, 2010; Firat et al., 2009). The DNNs, such as TDL and RNN, give appropriate results in rainfall-runoff modeling, specifically in terms of peak flow events (Kourgialas et al., 2015). The DNN models can reduce the time involved in training processes by decreasing the dimension of the network's input. In addition, the adjustable simulation results of the DNN models show that the output data depend not only on the current state of the input signals but also on the previous status of the signals at each time step (Araghinejad, 2014; Yaseen et al., 2015).

References Adnan, R.M., Yuan, X., Kisi, O., Yuan, Y., 2017. Streamflow forecasting using artificial neural network and support vector machine models. Am. Sci. Res. J. Eng., Technol., Sci. 29, 286e294. Agatonovic-Kustrin, S., Beresford, R., 2000. Basic concepts of artificial neural network (ANN) modeling and its application in pharmaceutical research. J. Pharmaceut. Biomed. Anal. 22, 717e727. https://doi.org/10.1016/S0731-7085(99)00272-1. Ahmed, J.A., Sarma, A.K., 2007. Artificial neural network model for synthetic streamflow generation. Water Resour. Manag. 21, 1015e1029. https://doi.org/10.1007/s11269-006-9070-y. Allawi, M.F., Binti Othman, F., Afan, H.A., Ahmed, A.N., Hossain, M., Fai, C.M., El-Shafie, A., 2019. Reservoir evaporation prediction modeling based on artificial intelligence methods. Water 11, 1226. https://doi.org/10.3390/w11061226. Alotaibi, K., Ghumman, A.R., Haider, H., Ghazaw, Y.M., Shafiquzzaman, M., 2018. Future Predictions of Rainfall and Temperature Using GCM and ANN for Arid Regions: A Case Study for the Qassim Region, Saudi Arabia. Water, 10. https://doi.org/10.3390/w10091260. Apaydin, H., Feizi, H., Sattari, M.T., Colak, M.S., Shamshirband, S., Chau, K.-W., 2020. Comparative Analysis of Recurrent Neural Network Architectures for Reservoir Inflow Forecasting. Water, 12. https://doi.org/10.3390/w12051500. Araghinejad, S., 2014. Data-Driven Modeling: Using MATLABÒ in Water Resources and Environmental Engineering, vol. 67. Water Science and Technology Library. Ateeq Ur, R., Ghumman, A.R., Ahmad, S., Hashmi, H.N., 2018. Performance assessment of artificial neural networks and support vector regression models for stream flow predictions. Environ. Monit. Assess. 190, 704. https://doi.org/10.1007/s10661-018-7012-9. Awchi, T.A., 2014. River discharges forecasting in northern Iraq using different ANN techniques. Water Resour. Manag. 28, 801e814. https://doi.org/10.1007/s11269-014-0516-3. Banihabib, M.E., Bandari, R., Peralta, R.C., 2019. Auto-regressive neural-network models for long lead-time forecasting of daily flow. Water Resour. Manag. 33, 159e172. https://doi.org/ 10.1007/s11269-018-2094-2. Banihabib, M.E., Chitsaz, N., Randhir, T.O., 2020. Non-compensatory decision model for incorporating the sustainable development criteria in flood risk management plans. SN Appl. Sci. 2 (1), 1e11. https://doi.org/10.1007/s42452-019-1695-6. Behzad, M., Asghari, K., Eazi, M., Palhang, M., 2009. Generalization performance of support vector machines and neural networks in runoff modeling. Expert Syst. Appl. 36, 7624e7629. https://doi.org/10.1016/j.eswa.2008.09.053.


Belotti, J.T., Lazzarin, L.N.A., Usberti, F.L., Siqueira, H., 2018. Seasonal streamflow series forecasting using recurrent neural networks. In: 2018 IEEE Latin American Conference on Computational Intelligence (LA-CCI). Carrier, C., Kalra, A., Ahmad, S., 2013. Using paleo reconstructions to improve streamflow forecast lead time in the western United States. J. Am. Water Res. Associ. 49, 1351e1366. https://doi.org/10.1111/jawr.12088. Chan, Z.S., Ngan, H., Rad, A.B., David, A., Kasabov, N., 2006. Short-term ANN load forecasting from limited data using generalization learning strategies. Neurocomputing 70, 409e419. https://doi.org/10.1016/j.neucom.2005.12.131. Chen, C.C., Juan, H.H., Tsai, M.Y., Lu, H.H.S., 2018. Unsupervised learning and pattern recognition of biological data structures with density functional theory and machine learning. Sci. Rep. 8, 1e11. https://doi.org/10.1038/s41598-017-18931-5. Chiang, Y.M., Chang, L.C., Chang, F.J., 2004. Comparison of static-feedforward and dynamicfeedback neural networks for rainfallerunoff modeling. J. Hydrol. 290, 297e311. https:// doi.org/10.1016/j.jhydrol.2003.12.033. Chiang, Y.M., Chang, F.J., Jou, B.J.D., Lin, P.F., 2007. Dynamic ANN for precipitation estimation and forecasting from radar observations. J. Hydrol. 334, 250e261. https://doi.org/10.1016/j. jhydrol.2006.10.021. Chitsaz, N., Azarnivand, A., Araghinejad, S., 2016. Pre-processing of data-driven river flow forecasting models by singular value decomposition (SVD) technique. Hydrol. Sci. J. 61, 2164e2178. https://doi.org/10.1080/02626667.2015.1085991. Chitsaz, N, Banihabib, M.E, 2015. Comparison of different multi criteria decision-making models in prioritizing flood management alternatives. Water Resour. Manag. 29 (8), 2503e2525. https://doi.org/10.1007/s11269-015-0954-6. Chitsaz, N., Hosseini-Moghari, S.M., 2018. Introduction of new datasets of drought indices based on multivariate methods in semi-arid regions. Nord. Hydrol. 49, 266e280. https://doi.org/10. 2166/nh.2017.254. Chung, J.R., Kwon, J., Choe, Y., 2009. Evolution of recollection and prediction in neural networks. IJCNN: 2009 Int. Joint Confer. Neural Network. 1e6, 3363e3369. https://doi.org/10.1109/ IJCNN.2009.5179065. Cigizoglu, H.K., 2005. Application of generalized regression neural networks to intermittent flow forecasting and estimation. J. Hydrol. Eng. ASCE 10, 336e341. https://doi.org/10.1061/ (ASCE)1084-0699(2005)10:4(336). Danandeh Mehr, A., Kahya, E., Sahin, A., Nazemosadat, M.J., 2015. Successive-station monthly streamflow prediction using different artificial neural network algorithms. Int. J. Environ. Sci. Technol. 12, 2191e2200. https://doi.org/10.1007/s13762-014-0613-0. Dariane, A., Azimi, S., 2018. Streamflow forecasting by combining neural networks and fuzzy models using advanced methods of input variable selection. J. Hydroinf. 20, 520e532. https:// doi.org/10.2166/hydro.2017.076. Dibike, Y.B., Velickov, S., Solomatine, D., Abbott, M.B., 2001. Model induction with support vector machines: introduction and applications. J. Comput. Civ. Eng. 15, 208e216. https://doi. org/10.1061/(ASCE)0887-3801(2001)15:3(208). Dolling, O.R., Varas, E.A., 2002. Artificial neural networks for streamflow prediction. J. Hydraul. Res. 40, 547e554. https://doi.org/10.1080/00221680209499899. Elganiny, M.A., Eldwer, A.E., 2018. Enhancing the forecasting of monthly streamflow in the main key stations of the river Nile basin. Water Resour. 45, 660e671. https://doi.org/10.1134/ S0097807818050135.

144 Advances in Streamflow Forecasting Fathian, F., Mehdizadeh, S., Sales, A.K., Safari, M.J.S., 2019. Hybrid models to improve the monthly river flow prediction: integrating artificial intelligence and non-linear time series models. J. Hydrol. 575, 1200e1213. https://doi.org/10.1016/j.jhydrol.2019.06.025. Firat, M., Yurdusev, M.A., Turan, M.E., 2009. Evaluation of artificial neural network techniques for municipal water consumption modeling. Water Resour. Manag. 23, 617e632. https://doi. org/10.1007/s11269-008-9291-3. Ghorbani, M.A., Zadeh, H.A., Isazadeh, M., Terzi, O., 2016. A comparative study of artificial neural network (MLP, RBF) and support vector machine models for river flow prediction. Environ. Earth Sci. 75. https://doi.org/10.1007/s12665-015-5096-x. Govindaraju, R.S., Rao, A.R., 2000. Artificial Neural Networks in Hydrology, vol. 36. Water Science and Technology Library. Hashimi, S.A., Dalkilic¸, H.Y., 2020. Prediction of daily streamflow using artificial neural networks (ANNs), wavelet neural networks (WNNs), and adaptive neuro-fuzzy inference system (ANFIS) models. Water Supply 20, 1396e1408. https://doi.org/10.2166/ws.2020.062. He, Z.B., Wen, X.H., Liu, H., Du, J., 2014. A comparative study of artificial neural network, adaptive neuro fuzzy inference system and support vector machine for forecasting river flow in the semiarid mountain region. J. Hydrol. 509, 379e386. https://doi.org/10.1016/j.jhydrol. 2013.11.054. Honorato, A.G., Silva, G.B., Guimara˜es Santos, C.A., 2019. Monthly streamflow forecasting using neuro-wavelet techniques and input analysis. Hydrol. Sci. J. 63, 2060e2075. https://doi.org/ 10.1080/02626667.2018.1552788. Hotelling, H., 1933. Analysis of a complex of statistical variables into principal components. J. Educ. Psychol. 24, 417. https://doi.org/10.1037/h0071325. Hsu, K.-l., Gupta, H.V., Sorooshian, S., 1998. Streamflow forecasting using artificial neural networks. In: Proceedings of the 1998 International Water Resources Engineering Conference, vol. 2, pp. 967e972. Hu, T.S., Lam, K.C., Ng, S.T., 2005. A modified neural network for improving river flow prediction. Hydrol. Sci. J. 50, 299e318. https://doi.org/10.1623/hysj.50.2.299.61794. Hu, Y., Yan, L., Hang, T., Feng, J., 2020. Stream-Flow Forecasting of Small Rivers Based on LSTM arXiv preprint arXiv:2001.05681. Hu¨sken, M., Stagge, P., 2003. Recurrent neural networks for time series classification. Neurocomputing 50, 223e235. https://doi.org/10.1016/S0925-2312(01)00706-8. Hussain, A.J., Liatsis, P., Khalaf, M., Tawfik, H., Al-Asker, H., 2018. A dynamic neural network architecture with immunology inspired optimization for weather data forecasting. Big Data Res. 14, 81e92. https://doi.org/10.1016/j.bdr.2018.04.002. Jain, A., Kumar, A.M., 2007. Hybrid neural network models for hydrologic time series forecasting. Appl. Soft Comput. 7, 585e592. https://doi.org/10.1016/j.asoc.2006.03.002. Javan, K., Lialestani, M.R.F.H., Nejadhossein, M., 2015. A comparison of ANN and HSPF models for runoff simulation in Gharehsoo River watershed, Iran. Model. Earth Syst. Environ. 1, 41. https://doi.org/10.4236/ajcc.2015.43016. Jiang, F., Jiang, Y., Zhi, H., Dong, Y., Li, H., Ma, S., Wang, Y., Dong, Q., Shen, H., Wang, Y., 2017. Artificial intelligence in healthcare: past, present and future. Stroke Vasc. Neurol. 2, 230e243. https://doi.org/10.1136/svn-2017-000101. Kagoda, P.A., Ndiritu, J., Ntuli, C., Mwaka, B., 2010. Application of radial basis function neural networks to short-term streamflow forecasting. Phys. Chem. 
Earth, Parts A/B/C 35, 571e581. https://doi.org/10.1016/j.pce.2010.07.021.


Kim, M., Baek, S., Ligaray, M., Pyo, J., Park, M., Cho, K.H., 2015. Comparative studies of different imputation methods for recovering streamflow observation. Water 7, 6847e6860. https://doi.org/10.3390/w7126663. ¨ ., 2010. Generalized regression neural networks for evapotranspiration modelling. Hydrol. Kis¸i, O Sci. J. 51, 1092e1105. https://doi.org/10.1623/hysj.51.6.1092. Kohonen, T., 1982. Analysis of a simple self-organizing process. Biol. Cybern. 44 (5), 135e140. Kohonen, T., Somervuo, P., 1998. Self-organizing maps of symbol strings. Neurocomputing 21, 19e30. https://doi.org/10.1016/S0925-2312(98)00031-9. Kotsiantis, S.B., Kanellopoulos, D., Pintelas, P.E., 2006. Data preprocessing for supervised leaning. Proc. World Acad. Sci. Eng. Technol. 12 (12), 278e283. Kourgialas, N.N., Dokou, Z., Karatzas, G.P., 2015. Statistical analysis and ANN modeling for predicting hydrological extremes under climate change scenarios: the example of a small Mediterranean agro-watershed. J. Environ. Manag. 154, 86e101. https://doi.org/10.1016/j. jenvman.2015.02.034. Lee, D.H., Kang, D.S., 2016. The application of the artificial neural network ensemble model for simulating streamflow. In: 12th International Conference on Hydroinformatics (HIC 2016) Smart Water for the Future, vol. 154, pp. 1217e1224. Lee, K.D., Booth, D., Alam, P., 2005. A comparison of supervised and unsupervised neural networks in predicting bankruptcy of Korean firms. Expert Syst. Appl. 29, 1e16. https://doi.org/ 10.1016/j.eswa.2005.01.004. Li, P.H., Kwon, H.H., Sun, L.Q., Lall, U., Kao, J.J., 2010. A modified support vector machine based prediction model on streamflow at the Shihmen Reservoir, Taiwan. Int. J. Climatol. 30, 1256e1268. https://doi.org/10.1002/joc.1954. Li, Y., Chao, X., 2020. ANN-based continual classification in agriculture. Agriculture 10, 178. Maier, H.R., Jain, A., Dandy, G.C., Sudheer, K.P., 2010. Methods used for the development of neural networks for the prediction of water resource variables in river systems: current status and future directions. Environ. Model. Software 25, 891e909. https://doi.org/10.1016/j. envsoft.2010.02.003. Markopoulos, A.P., Georgiopoulos, S., Manolakos, D.E., 2016. On the use of back propagation and radial basis function neural networks in surface roughness prediction. J. Indus. Eng. Int. 12, 389e400. https://doi.org/10.1007/s40092-016-0146-x. McNelis, P.D., 2005. Neural Networks in Finance: Gaining Predictive Edge in the Market. Academic Press Advanced Finances Series. Mhaskar, H.N., 1993. Approximation properties of a multilayered feedforward artificial neural network. Adv. Comput. Math. 1, 61e80. Mutlu, E., Chaubey, I., Hexmoor, H., Bajwa, S.G., 2008. Comparison of artificial neural network models for hydrologic predictions at multiple gauging stations in an agricultural watershed. Hydrol. Process. 22, 5097e5106. https://doi.org/10.1002/hyp.7136. Nash, J.E., Sutcliffe, J.V., 1970. River flow forecasting through conceptual models part I d a discussion of principles. J. Hydrol. 10, 282e290. https://doi.org/10.1016/0022-1694(92) 90146-M. Nawi, N.M., Atomi, W.H., Rehman, M.Z., 2013. The effect of data pre-processing on optimized training of artificial neural networks. In: 4th International Conference on Electrical Engineering and Informatics (ICEEI 2013), vol. 11, pp. 32e39. Nguyen, H., Bui, X.-N., Bui, H.-B., Mai, N.-L., 2020. A comparative study of artificial neural networks in predicting blast-induced air-blast overpressure at Deo Nai open-pit coal mine. Vietnam Neural Comput. 
Applicat. 32, 3939e3955. https://doi.org/10.1007/s00521018-3717-5.

146 Advances in Streamflow Forecasting Noori, R., Karbassi, A.R., Moghaddamnia, A., Han, D., Zokaei-Ashtiani, M.H., Farokhnia, A., Gousheh, M.G., 2011. Assessment of input variables determination on the SVM model performance using PCA, Gamma test, and forward selection techniques for monthly stream flow prediction. J. Hydrol. 401, 177e189. https://doi.org/10.1016/j.jhydrol.2011.02.021. Poul, A.K., Shourian, M., Ebrahimi, H., 2019. A comparative study of MLR, KNN, ANN and ANFIS models with wavelet transform in monthly stream flow prediction. Water Resour. Manag. 33 (8), 2907e2923. https://doi.org/10.1007/s11269-019-02273-0. Ravansalar, M., Rajaee, T., Kis¸i, O., 2017. Wavelet-linear genetic programming: a new approach for modeling monthly streamflow. J. Hydrol. 549, 461e475. https://doi.org/10.1016/j.jhydrol. 2017.04.018. Sahoo, A., Samantaray, S., Ghose, D.K., 2019. Stream flow forecasting in mahanadi river basin using artificial neural networks. Procedia Comput. Sci. 157, 168e174. https://doi.org/10.1016/ j.procs.2019.08.154. Seif, A., Mokarram, M., Sathyamoorthy, D., 2014. Using self-organizing maps for alluvial fan classification. Int. J. Sci. Res. Knowl. 2, 189e198. https://doi.org/10.12983/ijsrk-2014-p01890198. Shirgure, P., 2013. Evaporation modeling with artificial neural network: a review. Sci. J. Rev. 2, 73e84. Shoaib, M., Shamseldin, A.Y., Melville, B.W., M.M., K., 2016. Hybrid Wavelet Neural Network Approach. Artificial Neural Network Modelling. Springer, Cham, p. 628. Sinha, N.K., Gupta, M.M., Rao, D.H., 2000. Dynamic neural networlks: an overview. Proc. IEEE Int. Confer. Indus. Technol. 1, 491e496. Sivapragasam, C., Liong, S.Y., 2005. Flow categorization model for improving forecasting. Nordic Hydrol. 36, 37e48. https://doi.org/10.2166/nh.2005.0004. Somervuo, P., Kohonen, T., 1999. Self-organizing maps and learning vector quantization for feature sequences. Neural Process. Lett. 10, 151e159. https://doi.org/10.1023/A: 1018741720065. Specht, D.F., 1991. A general regression neural network. IEEE Trans. Neural Network. 2, 568e576. https://doi.org/10.1109/72.97934. Sudheer, C., Anand, N., Panigrahi, B.K., Mathur, S., 2013. Streamflow forecasting by SVM with quantum behaved particle swarm optimization. Neurocomputing 101, 18e23. https://doi.org/ 10.1016/j.neucom.2012.07.017. Sukor, A.S.A., Zakaria, A., Rahim, N.A., Kamarudin, L.M., Setchi, R., Nishizaki, H., 2019. A hybrid approach of knowledge-driven and data-driven reasoning for activity recognition in smart homes. J. Intell. Fuzzy Syst. 36, 4177e4188. https://doi.org/10.3233/JIFS-169976. Terzi, O., Ergin, G., 2014. Forecasting of monthly river flow with autoregressive modeling and data-driven techniques. Neural Comput. Appl. 25, 179e188. https://doi.org/10.1007/s00521013-1469-9. Toth, E., Brath, A., 2007. Multistep ahead streamflow forecasting: role of calibration data in conceptual and neural network modeling. Water Resour. Res. 43. https://doi.org/10.1029/ 2006WR005383. Trevor, H., Robert, T., Friedman, J.H., 2009. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Tsoi, A.C., Back, A., 1995. Static and dynamic preprocessing methods in neural networks. Eng. Appl. Artif. Intell. 8, 633e642. https://doi.org/10.1016/0952-1976(95)00047-X. Tyralis, H., Papacharalampous, G., Langousis, A., 2019. A Brief Review of Random Forests for Water Scientists and Practitioners and Their Recent History Inwater Resources. Water (Switzerland), p. 11. https://doi.org/10.3390/w11050910.


Wan, E.A., 1994. Time-series prediction by using a connectionist network with internal delaylines. In: Time Series Prediction: Forecasting the Future and Understanding the Past, vol. 15, pp. 195e217. Wu, C., Chau, K., Fan, C., 2010. Prediction of rainfall time series using modular artificial neural networks coupled with data-preprocessing techniques. J. Hydrol. 389, 146e167. https://doi. org/10.1016/j.jhydrol.2010.05.040. Wu, J.S., Han, J., Annambhotla, S., Bryant, S., 2005. Artificial neural networks for forecasting watershed runoff and stream flows. J. Hydrol. Eng. ASCE 10 (3), 216e222. https://doi.org/10. 1061/(ASCE)1084-0699(2005)10:3(216). Wu, Y., Wang, H., Zhang, B., Du, K.L., 2012. Using radial basis function networks for function approximation and classification. ISRN Appl. Mathemat. 324194, 2012. Xia, F., Fan, L., 2012. Application of Artificial Neural Network (ANN) for Prediction of Power Load. Springer Berlin Heidelberg, Berlin, Heidelberg, pp. 673e677. Yang, Z.R., Yang, Z., 2014. 6.01 - artificial neural networks. In: BRAHME, A. (Ed.), Comprehensive Biomedical Physics. Elsevier, Oxford. Yaseen, Z.M., El-Shafie, A., Afan, H.A., Hameed, M., Mohtar, W.H.M.W., Hussain, A., 2016. RBFNN versus FFNN for daily river flow forecasting at Johor River, Malaysia. Neural Comput. Appl. 27, 1533e1542. https://doi.org/10.1007/s00521-015-1952-6. Yaseen, Z.M., El-shafie, A., Jaafar, O., Afan, H.A., Sayl, K.N., 2015. Artificial intelligence based models for stream-flow forecasting: 2000e2015. J. Hydrol. 530, 829e844. https://doi.org/10. 1016/j.jhydrol.2015.10.038. Yonaba, H., Anctil, F., Fortin, V., 2010. Comparing sigmoid transfer functions for neural network multistep ahead streamflow forecasting. J. Hydrol. Eng. ASCE 15, 275e283. https://doi.org/ 10.1061/(ASCE)HE.1943-5584.0000188. Zhang, H.B., Singh, V.P., Bin Wang, B., Yu, Y.H., 2016. CEREF: a hybrid data-driven model for forecasting annual streamflow from a socio-hydrological system. J. Hydrol. 540, 246e256. https://doi.org/10.1016/j.jhydrol.2016.06.029. Zhu, S., Luo, X., Yuan, X., Xu, Z., 2020. An improved long short-term memory network for streamflow forecasting in the upper Yangtze River. Stoch. Environ. Res. Risk Assess. 1e17. https://doi.org/10.1007/s00477-020-01766-4.

Chapter 5

Application of different artificial neural network for streamflow forecasting

Md Manjurul Hussain 1, Sheikh Hefzul Bari 2, Ishtiak Mahmud 3, Mohammad Istiyak Hossain Siddiquee 4

1 Institute of Water and Flood Management, Bangladesh University of Engineering and Technology, Dhaka, Bangladesh; 2 Department of Civil Engineering, Leading University, Sylhet, Bangladesh; 3 Civil and Environmental Engineering, Shahjalal University of Science and Technology, Sylhet, Bangladesh; 4 Data and Knowledge Engineering, Otto-von-Guericke University of Magdeburg, Magdeburg, Saxony-Anhalt, Germany

5.1 Introduction

The term "streamflow" refers to the volume of water flowing through a river or channel. Streamflow forecasting is a vital part of water resources management and control. A good forecasting model enables efficient water resources planning and management, including, but not limited to, providing water for irrigation and household use, hydropower planning, drought analysis, and flood management. Streamflow is a very complex hydrological process because it is controlled by many climatic, geologic, and physical factors, which makes developing an accurate streamflow forecasting model a challenging task. Moreover, collecting many variables and feeding them all into a model is often cumbersome and inconvenient. In such cases, forecasting streamflow from historical flow data has become a convenient option for many researchers and practitioners. Streamflow forecasts with lead times of hours or days are generally used for real-time operation of water resources systems and flood warning, whereas forecasts with lead times of weeks, months, and years are generally used for water resources planning and management (Govindaraju and Rao, 2013). There are usually three approaches to forecasting a hydrological time series, viz., empirical methods, geomorphological methods, and physical models (ASCE Task Committee on Application of Artificial Neural Networks in Hydrology, 2000). Empirical models are often of the black-box type owing to their unknown underlying process and complexity. These model developments are also often region-specific.


In the recent past, several empirical models have been developed and proposed for time series forecasting (Salas and Smith, 1981). The second type of method is the geomorphology-based model, which is an improvement over the empirical method (Gupta and Waymire, 1983). The third type is the physically based model, which can be developed efficiently if the physical characteristics of the watershed are known at the model grid scale (ASCE Task Committee on Application of Artificial Neural Networks in Hydrology, 2000). Such data, however, are rarely available in the real world, and these models also require historical data for calibration. The question of the best model therefore remains unresolved, which has led scientists to seek easy-to-implement yet efficient alternative solutions for forecasting various hydrological processes. In this sense, the artificial neural network (ANN) has become one of the most popular and extensively used models for forecasting hydrological processes. ANNs are inspired by biological neural networks. According to the above classification, the ANN can be labeled as an empirical model, but unlike most empirical models it is not based on a prescribed statistical relation; rather, the neurons in an ANN learn from experience, as in the human brain. A detailed discussion of ANN structure is given later in this chapter. The ANN technique of forecasting is generally referred to as a black-box model due to its complex and unknown underlying process (Sahoo et al., 2019). It is capable of mapping the nonlinear behavior of hydrological datasets and making predictions without describing the physical characteristics of the process (Kişi and Sanikhani, 2015; Uysal et al., 2016). These characteristics make ANNs well suited for forecasting and modeling hydrological processes.

5.2 Development of neural network technique

The history of the ANN is a long one. In the late 1940s, Canadian psychologist Donald O. Hebb (Hebb, 1949) proposed a learning hypothesis based on the mechanism of neuroplasticity, while many other researchers were working on the same problem. In 1958, Rosenblatt created an algorithm for pattern recognition called the perceptron (Rosenblatt, 1958). The perceptron is essentially a single-neuron architecture and is considered the simplest type of ANN. The first multilayer ANN was proposed by Ivakhnenko and Lapa (Ivakhnenko and Lapa, 1966; Schmidhuber, 2015). In 1975, Werbos came up with a revolutionary idea, the backpropagation algorithm, an effective error-minimization (or learning) algorithm for reducing multilayer neural network error (Werbos, 1975). Kunihiko Fukushima (Fukushima, 1980) introduced the "neocognitron," which later inspired convolutional neural networks (CNNs). Another new type of neural network, the Hopfield network, was developed by John Hopfield (Hopfield, 1982) and later contributed to one of the most effective networks, the recurrent neural network (RNN).


Until then, progress had been confined mainly to the development of several types of network architectures. The first breakthrough happened in 1986, when David Rumelhart, Geoff Hinton, and Ronald J. Williams applied the backpropagation algorithm to reduce multilayer network error (Rumelhart et al., 1986). One year later, in 1987, the time delay neural network was introduced by Alex Waibel (Waibel, 1987); because it used a shift-invariance approach, it can be considered the first CNN. In 1997, Sepp Hochreiter and Jürgen Schmidhuber invented the long short-term memory (LSTM) RNN, considerably improving the efficiency and practicality of RNNs, which are among the most effective networks for sequential data (Hochreiter and Schmidhuber, 1997). In the meantime, many improvements were made to learning algorithms (see Almeida et al., 1997; Battiti, 1989; Jacobs, 1988; Lapedes and Farber, 1986; LeCun et al., 1993; Neuneier and Zimmermann, 1998; Silva and Almeida, 1990; Vogl et al., 1988; Yu et al., 1995). Hinton et al. (2006) proposed a fast learning algorithm showing how a multilayered feedforward neural network can be effectively pretrained one layer at a time, treating each layer in turn as an unsupervised restricted Boltzmann machine (Smolensky, 1986), and then fine-tuned using supervised backpropagation. To reduce the overfitting problem in neural networks, the "dropout regularization" technique was proposed by Hinton et al. (2012). Another outstanding neural network, the generative adversarial network (GAN), was invented by Goodfellow et al. (2014) and is used to generate new data such as images or music. Many different types of neural networks have been invented to date (Fig. 5.1). Most neural network models were developed and evolved to solve complex problems such as classification and object detection in images or videos, understanding voice commands, etc. Nevertheless, many are also used in time series forecasting (e.g., precipitation, temperature, streamflow, or stock market prediction). The neural network models most commonly used for time series forecasting are described below.

FIGURE 5.1 Timeline of artificial neural network development.


5.2.1 Multilayer perceptron

The multilayer perceptron (MLP), also known as the feedforward neural network (FFNN), is the most common neural network architecture used in time series as well as streamflow forecasting (Gupta et al., 2000). Although it is popularly known as the MLP, the model is essentially multiple layers of logistic regression models with continuous nonlinearities rather than perceptrons with discrete values (Bishop, 2006). The MLP consists of three layers, viz., an input layer, a hidden layer, and an output layer. Every layer has one or more neurons, and every neuron is connected with every neuron of the next layer (Fig. 5.2). Layers of this type are called fully connected, or dense, layers. The input neurons receive data and multiply them by (initially random) weights; the weighted data are then passed to the next layer. The fact that the input layer only passes the data on, without taking part in any calculation related to the learning part of the algorithm, has led many authors to question its status as a layer of the MLP (Bishop, 2006). Every neuron of the next layer receives the result, adds a bias, and then passes it through an activation function. Activation functions are used to capture nonlinearity in the input-output relationship (Viswanath et al., 2019). Various functions can be used as activation functions; the most popular are tanh, sigmoid, purelin (linear), and the Rectified Linear Unit (ReLU) (Table 5.1). ReLU is newer than the other activation functions, but its popularity has increased rapidly (Géron, 2019; Nair and Hinton, 2010), owing to its faster learning speed and the absence of the vanishing gradient problem (Géron, 2019; Ide and Kurita, 2017). By propagating data from one layer to the next until the output layer is reached, activation functions help capture the complex relationship between input and output. The results of the output layer are compared with the known target data, i.e., the values the model should reproduce. The resulting error is then minimized by updating the weights using the backpropagation algorithm.
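The structure described above translates almost directly into code. The following minimal sketch, written with the Keras API of TensorFlow, builds a dense MLP with one hidden layer, compiles it with the Adam optimizer, and fits it by backpropagation; the toy dataset and layer sizes are illustrative assumptions, not the configuration used later in this chapter.

```python
# Minimal illustrative MLP sketch (toy data; sizes are assumptions, not the chapter's setup).
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(42)
X = rng.random((500, 8))                         # 8 input features (e.g., lagged flows)
y = X.sum(axis=1, keepdims=True)                 # toy target

model = tf.keras.Sequential([
    tf.keras.layers.Dense(20, activation="relu", input_shape=(8,)),  # hidden (dense) layer
    tf.keras.layers.Dense(1, activation="linear"),                   # output layer
])
model.compile(optimizer="adam", loss="mse")       # error minimized by backpropagation (Adam)
model.fit(X, y, epochs=20, batch_size=32, verbose=0)
```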

FIGURE 5.2 A typical multilayer network.


TABLE 5.1 List of activation functions (the graphs shown in the original table are omitted here).

Activation function | Equation
Linear | $f(x) = x$
Hyperbolic tangent (tanh) | $f(x) = \dfrac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$
Sigmoid (logistic) | $f(x) = \dfrac{1}{1 + e^{-x}}$
Rectified Linear Unit (ReLU) | $f(x) = \begin{cases} 0, & x < 0 \\ x, & x \ge 0 \end{cases}$
Sign | $f(x) = \begin{cases} -1, & x < 0 \\ 0, & x = 0 \\ 1, & x > 0 \end{cases}$
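For reference, the activation functions listed in Table 5.1 can be written in a few lines of NumPy; this is a plain sketch of the formulas above, not code from the chapter.

```python
# NumPy versions of the activation functions in Table 5.1 (illustrative sketch).
import numpy as np

def linear(x):  return x
def tanh(x):    return (np.exp(x) - np.exp(-x)) / (np.exp(x) + np.exp(-x))
def sigmoid(x): return 1.0 / (1.0 + np.exp(-x))
def relu(x):    return np.where(x < 0.0, 0.0, x)
def sign(x):    return np.sign(x)   # -1 for x < 0, 0 for x = 0, 1 for x > 0

print(relu(np.array([-2.0, 0.0, 2.0])), sigmoid(np.array([0.0])))
```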

Nowadays, a number of variants of the backpropagation algorithm, such as gradient descent, momentum optimization, Nesterov Accelerated Gradient, AdaGrad, RMSProp, Adam, and Nadam optimization, are available (Géron, 2019); Adam optimization has been the most popular in recent times. The learning capacity of a neural network depends on the number of neurons and layers. The number of neurons in the input and output layers is determined by the number of input and output parameters, but there are no fixed rules for selecting the number of hidden layers or the number of neurons in those hidden layers. Increasing the number of hidden layers or of neurons per hidden layer increases the complexity of the model. A complex neural network model is better able to capture the complex relationship between the input and output layers, although it also has some drawbacks.


A complex network model is hard to train and has a greater chance of overfitting; conversely, an insufficient number of neurons or hidden layers may make the model underfit. It is therefore important to use an appropriate number of neurons and hidden layers to obtain the best model architecture. As with other machine learning models, the standard practice for modeling an ANN is to split the data into three parts: (1) a training dataset, (2) a validation dataset, and (3) a testing dataset (Hastie et al., 2009). Traditionally, the training dataset consists of 70% of the total data, with 15% each used for validation and testing. The idea is to use the training dataset to train the ANN model and the validation dataset to monitor the training so that the model does not overfit, whereas the testing dataset is used to evaluate how well the model generalizes beyond the training data. Numerous goodness-of-fit criteria, namely root mean square error (RMSE), mean square error, Nash-Sutcliffe efficiency, etc., are used to evaluate the performance of a model (Géron, 2019). Initially, a number of models with different combinations of neurons and layers are selected. If a model performs well on the training dataset but poorly on the validation dataset, it is considered overfitted; if it performs poorly on both, it is assumed to be underfitted. Comparing the results of all the selected models, an optimum model is chosen that performs well on both the training and validation datasets (Géron, 2019) (Fig. 5.3).
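The split-train-validate-compare loop described above can be sketched as follows; the 70/15/15 proportions follow the text, while the function names and the placeholder series are illustrative assumptions.

```python
# Sketch of the 70/15/15 split and RMSE-based comparison described above.
import numpy as np

def rmse(obs, sim):
    return float(np.sqrt(np.mean((np.asarray(obs) - np.asarray(sim)) ** 2)))

def split_series(data, train_frac=0.70, val_frac=0.15):
    n = len(data)
    i, j = int(n * train_frac), int(n * (train_frac + val_frac))
    return data[:i], data[i:j], data[j:]

series = np.arange(1000.0)                      # placeholder streamflow series
train, val, test = split_series(series)
# Fit each candidate architecture on `train`, compare rmse() on `val` to detect
# over-/underfitting, and report the chosen model's rmse() on `test` only once.
```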

FIGURE 5.3 Application-oriented schematic diagram for machine learning workflow.


5.2.2 Recurrent neural network

One of the major drawbacks of the MLP is that it cannot handle sequential data directly. This problem is resolved with RNNs, in which data are fed to the model in a sequential manner, so that every value in the time series of a parameter can depend on the preceding values (Kumar et al., 2019). The RNN works similarly to the MLP, except that neurons in an RNN not only send data to the next layer but also feed data back into their own layer (Géron, 2019). The preceding data therefore remain present in the layer, hence the name "recurrent" neural network. These self-fed neurons are called memory cells. Because of this structure, the output of the network depends on both the recent and the previous data (Fig. 5.4) (Kumar et al., 2004).
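The recurrence itself is simple to write out. The following NumPy sketch (weights, sizes, and the random sequence are arbitrary illustrations) shows how the hidden state of a simple RNN cell combines the current input with the previous state at every time step.

```python
# Manual sketch of a simple RNN memory cell: h_t = tanh(Wx x_t + Wh h_{t-1} + b).
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden = 1, 5
Wx = rng.normal(size=(n_hidden, n_in))          # input-to-hidden weights
Wh = rng.normal(size=(n_hidden, n_hidden))      # recurrent (hidden-to-hidden) weights
b = np.zeros(n_hidden)

x_seq = rng.normal(size=(8, n_in))              # a sequence of 8 inputs (e.g., lagged flows)
h = np.zeros(n_hidden)                          # initial state of the memory cells
for x_t in x_seq:
    h = np.tanh(Wx @ x_t + Wh @ h + b)          # the state carries past information forward
```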

5.2.3 Long short-term memory network

The RNN uses backpropagation through time (BPTT) (Werbos, 1990) as the backbone for training the model. BPTT faces serious issues when the sequence is long: the gradients calculated by this algorithm must be passed back from the end of the sequence to the beginning, and because this involves many multiplications by fractional values, the gradient tends to become smaller as the sequence length increases. This essentially cripples the model by restricting the length of the input sequence. To solve this problem, Hochreiter and Schmidhuber (1997) proposed a new type of RNN, named the LSTM. The main difference between the RNN and the LSTM lies in the memory cell. On inspecting an unrolled LSTM cell, it can be seen that two main streams pass from cell to cell (Fig. 5.4), where h(t) carries the short-term memories and c(t) carries the long-term memories (Géron, 2019).

FIGURE 5.4 Different types of memory cell of recurrent neural network. Modified after Olah, C., 2015. Understanding LSTM Networks. Retrieved from: http://colah.github.io/posts/2015-08-Understanding-LSTMs/.


This ensures an easy flow of gradients, which in turn allows the LSTM to memorize long sequences as well as it does short ones.
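In practice the gating machinery does not have to be coded by hand; a library LSTM layer can be dropped into a sequence-to-one regression model, as in this minimal Keras sketch (data and layer sizes are illustrative only).

```python
# Minimal LSTM regression sketch (toy data; sizes are assumptions).
import numpy as np
import tensorflow as tf

X = np.random.rand(500, 8, 1)                      # 500 samples, 8 time steps, 1 feature
y = np.random.rand(500, 1)

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(10, input_shape=(8, 1)),  # 10 LSTM memory cells
    tf.keras.layers.Dense(1),                      # one-step-ahead forecast
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=10, verbose=0)
```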

5.2.4 Gated recurrent unit

The gated recurrent unit (GRU) is a simplified version of the LSTM model. This model, proposed by Cho et al. (2014), mainly combines both short-term and long-term memory streams into a single stream using a gate (Fig. 5.4). Although it is a simpler version of the LSTM, it performs reasonably well (Greff et al., 2016).
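In Keras, moving from an LSTM to a GRU is essentially a one-layer substitution; the sketch below mirrors the LSTM example above with the recurrent layer swapped (sizes remain illustrative).

```python
# GRU variant of the previous sketch: only the recurrent layer changes.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.GRU(20, input_shape=(8, 1)),   # gated single memory stream
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
```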

5.2.5 Convolutional neural network

The CNN is slightly different from the other neural network models. It is usually used where the spatial arrangement of the data is significant and contains valuable information for the model, and it is currently the most widely used network for object (or image) recognition and classification. The CNN introduces two new types of layers: the convolution layer and the pooling layer (Géron, 2019; Kavitha and Srimathi, 2019; Pratt et al., 2019). These two layers are mainly used for extracting and processing important features from an image (Pratt et al., 2019). The convolution layer uses a set of filters, or kernels, that slide over the input images. Normally, a kernel is a 2D matrix (for a single-channel, i.e., gray-scale, image) or a 3D matrix (for a multichannel image, e.g., an RGB image with three channels), but 1D kernels are also used for vector data as well as time series data. There have also been other, rather ingenious, proposals for using 1D kernels; for example, as discussed in the "Network in Network" paper (Lin et al., 2013), the 1D kernel has been used as the fundamental building block of the inception network (Szegedy et al., 2015), which won the ImageNet challenge in multiple categories. After the different kernels have been applied over the image, the results are sent to the pooling layer. The goal of the pooling layer is to subsample (i.e., shrink) the input in order to reduce the computational load, the memory usage, and the number of parameters, as well as to reduce the risk of overfitting (Géron, 2019). After the desired number of convolution and pooling layers, the result is sent to a fully connected (dense) layer and then to the output layer (Fig. 5.5).
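For one-dimensional (time series) inputs, the convolution and pooling layers described above can be assembled as in the following sketch; the filter count and kernel size are illustrative assumptions, not values used in the case study.

```python
# Sketch of a 1D CNN for a univariate sequence of 8 lags.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Conv1D(16, kernel_size=3, activation="relu",
                           input_shape=(8, 1)),     # 1D kernel slides along the time axis
    tf.keras.layers.MaxPooling1D(pool_size=2),       # pooling layer: subsample the features
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1),                        # dense layer feeding the output
])
model.compile(optimizer="adam", loss="mse")
```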

5.2.6 WaveNet

The main idea of WaveNet (Oord et al., 2016) was to design an autoregressive model in which, statistically speaking, each predictive distribution is conditioned on all previous observations. To achieve this, the authors employed the causal convolution technique, whereby part of the input sequence is masked so that the model cannot see it.


FIGURE 5.5 A typical convolutional neural network architecture. Modified after Zhou, K., Zheng, Y., Li, B., Dong, W., Zhang, X., 2019. Forecasting different types of convective weather: a deep learning approach. J. Meteorol. Res. 33 (5), 797e809.

This ensures that the model cannot look ahead in time. Causal convolution, however, has an issue with the number of layers: to increase the receptive field of the model, many more layers are required than in a regular CNN. To tackle this challenge, the authors used dilated convolutions with varying dilation rates. Dilated convolution has the added benefit that the dimension of the input is unchanged along the layers, which is not the case when pooling is used in place of dilation (Géron, 2019; Wan et al., 2019; Wang et al., 2019) (Fig. 5.6). The result of the dilated convolution layers is then sent to a dense layer or directly to the output layer, as required.
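A WaveNet-style stack can be approximated with causal, dilated 1D convolutions, as in the sketch below; the three layers with dilation rates 2, 4, and 8 echo the configurations tested in the case study later in this chapter, but the filter count and kernel size are illustrative assumptions.

```python
# Sketch of stacked causal, dilated 1D convolutions (WaveNet-style).
import tensorflow as tf

inputs = tf.keras.Input(shape=(8, 1))
x = inputs
for rate in (2, 4, 8):                              # growing dilation enlarges the receptive field
    x = tf.keras.layers.Conv1D(16, kernel_size=2, padding="causal",
                               dilation_rate=rate, activation="relu")(x)
x = tf.keras.layers.Flatten()(x)
outputs = tf.keras.layers.Dense(1)(x)               # dense layer / output layer
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="mse")
```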

5.3 Artificial neural network in streamflow forecasting

In recent years, ANNs have become popular among engineers and scientists for modeling hydrological processes, including streamflow. Karunanithi et al. (1994) used a neural network with the cascade correlation algorithm for flow forecasting in the Huron River at the Dexter sampling station in southeast Michigan, USA. They also performed an empirical comparison between neural networks and a commonly used nonlinear power model, and found the results of the ANNs quite encouraging.

FIGURE 5.6 A typical WaveNet architecture (every circle represents a neuron).


Markus et al. (1995) forecasted monthly streamflow in the Rio Grande Basin of the United States and compared the neural network with periodic transfer function models; the forecast biases were almost the same for both methods, while the RMSE was smaller for the neural network. Poff et al. (1996) used an ANN, with historical temperature and precipitation data as input, to evaluate the hydrological change of two streams in the northeastern United States. Muttiah et al. (1997) used the cascade correlation algorithm to predict the 2-year peak discharge for major river basins of the continental United States and found that the ANN model improved on traditional regression techniques. A comparative study among the autoregressive moving average model, the Thomas-Fiering model, and the ANN model using synthetic streamflow data for the Pagladiya River in India showed that the ANN performed better than the other two models (Ahmed and Sarma, 2007). Kişi (2009) found that a neuro-wavelet model significantly increased the accuracy of the ANN model in forecasting daily streamflow in the Ergene River and the Seytan Stream in Turkey. The ANN has also been used to develop a streamflow rating curve for the Nile River in Sudan (Tawfik et al., 1997). Xu et al. (2009) used an ANN and the physically based model TOPMODEL for daily streamflow forecasting in the Baohe River Basin, China, focusing on their comparative performance, and found that the ANN model performed better than TOPMODEL. A coupled wavelet transform and neural network method was used by Adamowski and Sun (2010) for forecasting flow of the Xeros River at Lazarides and the Kargotis River at Evrychou in Cyprus; the coupled wavelet neural network forecasted streamflow more accurately than the regular ANN model. Lohani et al. (2012) performed a comparative evaluation of autoregressive (AR), ANN, and adaptive neuro-fuzzy inference system (ANFIS) models for forecasting monthly reservoir inflow of the Sutlej River at Bhakra Dam, India, and reported that the fuzzy inference-based ANFIS model provided comparatively better results than the ANN and AR models. Adnan et al. (2017) used the ANN and the support vector machine (SVM) to investigate the ability of these two data-driven models to predict monthly streamflow at the Dainyor hydraulic station on the Indus River in Pakistan and found that both models were useful. Several other researchers have also used the ANN in streamflow forecasting (Kişi, 2004, 2005; Yaseen et al., 2015; Uysal et al., 2016; Adnan et al., 2017; Reza et al., 2018; Freire et al., 2019; Rahmani-Rezaeieh et al., 2020; Sun et al., 2019). It can be seen from the literature that most studies forecasted streamflow using MLP-based ANN models.


Recently, some other types of ANN models, such as LSTM (Le et al., 2019; Sahoo et al., 2019), CNN (Pulido-Calvo and Gutierrez-Estrada, 2009; Rehman et al., 2019), etc., have also been employed in streamflow forecasting. Yan et al. (2019) used the LSTM and SVM to forecast small-watershed streamflow in China and found that the LSTM has great potential in streamflow forecasting. Moreover, several other ANN models, such as GRU and WaveNet, have recently been used in time series forecasting (Sit and Demir, 2019; Wan et al., 2019; Wang et al., 2019), but they have rarely been applied to streamflow forecasting to date.

5.4 Application of ANN: a case study of the Ganges River

A case study is presented here in which the relative performance of six different types of neural network models is evaluated by applying them to forecast streamflow of the Ganges River. The Ganges, also known as the Ganga (in India) and the Padma (in Bangladesh), is one of the largest river systems in the world. It runs for more than 2500 km from the Himalayas to the Bay of Bengal through India and Bangladesh and carries about 300 × 10⁹ m³ of water and 520 Mt of sediment annually (Khan et al., 2019). The streamflow discharge data (1969-2018) for the Ganges River recorded at the Hardinge Bridge point in Bangladesh were collected from the Bangladesh Water Development Board (BWDB). The collected data were first converted into weekly streamflow discharge to maintain consistency in the time series. The frequency of the data was irregular and there were many missing values; therefore, the number of missing values was counted and the presence of inconsistent data was checked (Fig. 5.7). Finally, weekly streamflow data for an 18-year period (2001-18) were selected for use in this study. The missing values (about 5%) in the selected dataset were filled using linear interpolation, as the ANN models cannot handle missing values. The final data were then used for model building (Fig. 5.8).

FIGURE 5.7 Temporal variation of Ganges river discharge in Hardinge Bridge point.


FIGURE 5.8 Monthly boxplot of Ganges river discharge in Hardinge Bridge point.

In the univariate (one-variable) neural network model, the main assumption is that the data are significantly influenced by the past data at certain time lags (Toth and Brath, 2007). Hence, the autocorrelation function (ACF) plot was used to determine the number of significant time lags. The ACF plot (Fig. 5.9) shows a significant correlation in the river flow data up to eight lags; therefore, eight time lags were considered as inputs to the ANN models. After selecting the appropriate time lags, six types of neural network models were developed for streamflow forecasting using the training and validation datasets. The models considered in this study are the MLP (or FFNN), RNN, LSTM, GRU, CNN, and WaveNet. At first, four models (MLP, RNN, LSTM, and GRU) were developed with one hidden layer, and the performance of these single-hidden-layer models was checked using different numbers of neurons (5, 10, 15, 20, 25, and 30) in the hidden layer. Similarly, the performance of the CNN model was checked with one and two convolution and pooling layers, and the performance of the WaveNet model was checked with 1, 2, 3, 4, and 5 dilated convolution layers with dilation rates of 2, 4, 8, 16, and 32. Details of the input parameters for the six models, along with the values of their performance evaluation criteria, are summarized in Table 5.2. It can be seen that 20 neurons in the hidden layer for the MLP, 25 memory cells for the RNN, 10 LSTM cells for the LSTM model, 20 GRU cells for the GRU model, one convolution and pooling layer for the CNN, and three dilated convolution layers for the WaveNet performed best in their respective model categories.

FIGURE 5.9 Autocorrelation function plot for discharge data.


TABLE 5.2 Performance of six different artificial neural network models with different numbers of neurons in the first hidden layer.

Model | Neuron No. in first hidden layer | Training set RMSE | Validation set RMSE
MLP | 5 | 5462.229 | 6042.882
MLP | 10 | 4760.309 | 4938.453
MLP | 15 | 4516.039 | 4622.974
MLP* | 20 | 4538.944 | 4420.16
MLP | 25 | 4480.836 | 4589.513
MLP | 30 | 4534.024 | 4693.206
RNN | 5 | 17,066.69 | 16,522.93
RNN | 10 | 17,066.69 | 16,522.93
RNN | 15 | 17,066.69 | 16,522.93
RNN | 20 | 17,066.69 | 16,522.93
RNN* | 25 | 4305.522 | 4990.427
RNN | 30 | 4384.532 | 5420.101
LSTM | 5 | 17,260.8 | 16,659.07
LSTM* | 10 | 5058.705 | 5298.559
LSTM | 15 | 17,066.01 | 16,522.3
LSTM | 20 | 5708.75 | 6333.627
LSTM | 25 | 7552.782 | 6790.001
LSTM | 30 | 6533.871 | 7448.596
GRU | 5 | 11,377.85 | 10,564.58
GRU | 10 | 17,066.01 | 16,522.3
GRU | 15 | 4777.475 | 4789.988
GRU* | 20 | 4742.919 | 4713.296
GRU | 25 | 5314.868 | 5586.754
GRU | 30 | 4676.894 | 4829.303

Model | Convolution layer | Dilation rate | Training set RMSE | Validation set RMSE
CNN* | 1 | - | 6644.219 | 8053.641
CNN | 2 | - | 6717.886 | 8032.193
WaveNet | 1 | 2 | 4488.367 | 5042.545
WaveNet | 2 | 4 | 4386.315 | 5208.171
WaveNet* | 3 | 8 | 4384.481 | 4872.404
WaveNet | 4 | 16 | 4628.841 | 5391.81
WaveNet | 5 | 32 | 4470.203 | 5063.5

* Best model from each category (shown in bold in the original table).

It may also be noted that the CNN model performed the worst, and hence this model was discarded from the second-step assessment. In the second step of the analysis, a second hidden layer was introduced in each type of model to improve performance. In this step, the optimum number of neurons for the first hidden layer was fixed at the value selected in the first step, and only the number of neurons in the second hidden layer was subjected to trial and error. This second step followed the same procedure as the first, the only change being that, for the WaveNet model, one extra dense layer was added after the dilated convolution layers instead of an additional dilated convolution layer; the number of neurons for this dense layer was varied in the same way as before. The results are shown in Table 5.3. It can be seen from the table that the MLP with 15 neurons in the second hidden layer can be selected as the best-fit model among these six types of ANN models. Therefore, the structure of the final best-fit model is 8-20-15-1 (input, first hidden layer, second hidden layer, output). The best-fit model was selected based on the training and validation datasets only; it is therefore essential to validate the selected model using the testing dataset to see how it performs on unused data. The correlation coefficient between the observed and predicted discharge is found to be more than 93% (Fig. 5.10), and the time series plots of observed and predicted data show good agreement with each other (Fig. 5.11). Therefore, it is concluded that the ANN-based predictions for the training, validation, and testing datasets are all close to the observed discharge data.
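The complete scripts for this case study are linked in Section 5.7; the sketch below only illustrates, under assumed file and column names (and an assumed hidden-layer activation), the sequence of steps described above: interpolate the gaps, build the eight lagged inputs suggested by the ACF, split the series 70/15/15, fit the selected 8-20-15-1 MLP, and evaluate it on the testing set.

```python
# Hedged end-to-end sketch of the case-study workflow (file/column names are hypothetical).
import numpy as np
import pandas as pd
import tensorflow as tf

flow = pd.read_csv("ganges_weekly_discharge.csv", index_col=0, parse_dates=True)["Q"]
flow = flow.interpolate(method="linear")              # fill the ~5% missing values

n_lags = 8                                            # significant lags from the ACF plot
frame = pd.concat({f"lag{k}": flow.shift(k) for k in range(1, n_lags + 1)}, axis=1)
frame["target"] = flow
frame = frame.dropna()
X = frame.drop(columns="target").to_numpy()
y = frame["target"].to_numpy()

n = len(X)
i, j = int(0.70 * n), int(0.85 * n)                   # 70% train, 15% validation, 15% test

model = tf.keras.Sequential([                         # 8-20-15-1 architecture
    tf.keras.layers.Dense(20, activation="relu", input_shape=(n_lags,)),  # activation assumed
    tf.keras.layers.Dense(15, activation="relu"),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X[:i], y[:i], validation_data=(X[i:j], y[i:j]), epochs=100, verbose=0)

pred = model.predict(X[j:], verbose=0).ravel()
print("Testing RMSE:", float(np.sqrt(np.mean((pred - y[j:]) ** 2))))
```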


TABLE 5.3 Performance of six different artificial neural network models with different numbers of neurons in the second hidden layer.

Model | Neuron No. in second hidden layer | Training set RMSE | Validation set RMSE
MLP | 5 | 4579.359 | 4864.14
MLP | 10 | 4526.895 | 4569.32
MLP* | 15 | 4509.161 | 4492.958
MLP | 20 | 4740.721 | 4567.138
MLP | 25 | 4496.44 | 4578.662
MLP | 30 | 4556.889 | 4567.969
RNN | 5 | 4601.588 | 5711.661
RNN | 10 | 4561.04 | 5336.098
RNN | 15 | 17,066.69 | 16,522.93
RNN | 20 | 4479.707 | 5271.097
RNN | 25 | 4505.346 | 5259.549
RNN | 30 | 4180.546 | 5179.065
LSTM | 5 | 17,066.01 | 16,522.3
LSTM | 10 | 6994.489 | 7347.332
LSTM | 15 | 7635.621 | 6757.997
LSTM | 20 | 13,467.3 | 13,681.73
LSTM | 25 | 17,066.01 | 16,522.3
LSTM | 30 | 6624.648 | 7430.409
GRU | 5 | 5332.921 | 5653.094
GRU | 10 | 17,066.01 | 16,522.3
GRU | 15 | 6294.408 | 6777.499
GRU | 20 | 4458.701 | 4669.333
GRU | 25 | 4532.825 | 5271.684
GRU | 30 | 4662.329 | 4739.265
WaveNet | 5 | 4280.906 | 4899.258
WaveNet | 10 | 4239.503 | 4816.436
WaveNet | 15 | 4320.778 | 4803.244
WaveNet | 20 | 4283.443 | 4869.365
WaveNet | 25 | 4218.217 | 4879.068
WaveNet | 30 | 4076.18 | 4818.38

* Best model (shown in bold in the original table).

FIGURE 5.10 Q-Q plot of observed and predicted discharge.

FIGURE 5.11 Comparison of observed and model-predicted streamflow discharge.


5.5 ANN application software and programming language

A variety of paid and free tools are available for ANN modeling. The Matlab (https://www.mathworks.com/products/matlab.html) Deep Learning Toolbox (formerly known as the Neural Network Toolbox) is probably the most widely used paid software; the flexibility to modify and personalize code has made Matlab one of the most popular choices. Freely available tools for ANN modeling include Python (https://www.python.org/), R (https://www.r-project.org/), and Julia (https://julialang.org/). Python is one of the most widely used programming languages for implementing ANN models, with packages such as Keras (https://keras.io/), PyTorch (https://pytorch.org/), and Theano (http://www.deeplearning.net/software/theano/). Several packages are also available in R (e.g., neuralnet, h2o, nnet, deepnet, rnn, etc.); readers may find the book by Lewis (2017) useful for working with time series neural networks in R. Julia is one of the fastest-growing programming languages for data analysis and has its own ANN packages, such as Flux (https://fluxml.ai/) and Knet (https://github.com/denizyuret/Knet.jl). There are also cross-platform libraries such as TensorFlow (https://www.tensorflow.org/) and MXNet (https://mxnet.apache.org/), both of which are available in multiple programming languages such as Python, R, JavaScript, and Julia. In addition, ANN tools are available in several statistical software packages (e.g., Statistica, SPSS, etc.).

5.6 Conclusions

ANN models have been found to perform extremely well in the field of streamflow forecasting. However, there is further scope to improve the performance of ANN modeling; for example, the efficiency of an ANN-based forecasting model may be substantially improved by using multiple input parameters chosen through sensitivity analysis and by using hybrid models. In addition, many modifications of existing ANN models and new algorithms have been developed in recent years. Nevertheless, despite their improved output, ANN models cannot provide a clear-cut relationship among the interconnected parameters of a hydrological process. From a modeling point of view, a major drawback of neural networks is that the underlying physical processes or mechanisms are not easily understood, whereas statistical or stochastic models can reveal useful information about the series under study.

5.7 Supplementary information

The complete analysis for this chapter can be found at the following link: https://git.io/JYBtm.


References Adamowski, J., Sun, K., 2010. Development of a coupled wavelet transform and neural network method for flow forecasting of non-perennial rivers in semi-arid watersheds. J. Hydrol. 390, 85e91. https://doi.org/10.1016/j.jhydrol.2010.06.033. Adnan, R.M., Yuan, X., Kis¸i, O., Yuan, Y., 2017. Streamflow forecasting using artificial neural network and support vector machine models. Am. Sci. Res. J. Eng. Technol. Sci. (ASRJETS) 29 (1), 286e294. Ahmed, J.A., Sarma, A.K., 2007. Artificial neural network model for synthetic streamflow generation. Water Resour. Manag. 21 (6), 1015e1029. https://doi.org/10.1007/s11269-006-9070-y. Almeida, L.B., Langlois, T., Amaral, J.D., Redol, R.A., 1997. On-Line Step Size Adaptation. INESC, 9 Rua Alves Redol. p. 1000. ASCE Task Committee, 2000. Artificial neural networks in hydrology. I: preliminary concepts by the ASCE task committee on application of artificial neural networks in hydrology. J. Hydrol. Eng. ASCE 5 (2), 115e123. https://doi.org/10.1061/(ASCE)1084-0699(2000)5:2(115). Battiti, R., 1989. Accelerated backpropagation learning: two optimization methods. Complex Syst. 3, 331e342. Bishop, C.M., 2006. Pattern Recognition and Machine Learning. Springer, New York. p. 738. Cho, K., van Merrienboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., Bengio, Y., 2014. Learning Phrase Representations Using RNN Encoder-Decoder for Statistical Machine Translation arXiv:1406.1078 [cs, stat]. Freire, P.K. de M.M., Santos, C.A.G., da Silva, G.B.L., 2019. Analysis of the use of discrete wavelet transforms coupled with ANN for short-term streamflow forecasting. Appl. Soft Comput. 80, 494e505. https://doi.org/10.1016/j.asoc.2019.04.024. Fukushima, K., 1980. Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biol. Cybern. 36, 193e202. https://doi.org/ 10.1007/BF00344251. Ge´ron, A., 2019. Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems, 2 edition. O’Reilly Media. Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y., 2014. Generative adversarial nets. In: Ghahramani, Z., Welling, M., Cortes, C., Lawrence, N.D., Weinberger, K.Q. (Eds.), Advances in Neural Information Processing Systems, vol. 27. Curran Associates, Inc., pp. 2672e2680 Govindaraju, R.S., Rao, A.R., 2013. Artificial neural networks in hydrology, vol. 36. Springer Science & Business Media. p. 332. https://doi.org/10.1007/978-94-015-9341-0. Greff, K., Srivastava, R.K., Koutnı´k, J., Steunebrink, B.R., Schmidhuber, J., 2016. LSTM: a search space odyssey. IEEE Trans. Neural Netw. Learn. Syst. 28, 2222e2232. https://doi.org/ 10.1109/TNNLS.2016.2582924. Gupta, V.K., Waymire, E., 1983. On the formulation of an analytical approach to hydrologic response and similarity at the basin scale. J. Hydrol. 65, 95e123. https://doi.org/10.1016/ 0022-1694(83)90212-3. Gupta, H.V., Hsu, K., Sorooshian, S., 2000. Effective and efficient modeling for streamflow forecasting. In: Govindaraju, R.S., Rao, A.R. (Eds.), Artificial Neural Networks in Hydrology, Water Science and Technology Library. Springer Netherlands, Dordrecht, pp. 7e22. https:// doi.org/10.1007/978-94-015-9341-0_2. Hastie, T., Tibshirani, R., Friedman, J., 2009. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer Science & Business Media. p. 745.


Hebb, D.O., 1949. Organization of Behavior. Wiley, New York. p. 321. Hinton, G.E., Osindero, S., Teh, Y.-W., 2006. A fast learning algorithm for deep belief nets. Neural Comput. 18, 1527e1554. https://doi.org/10.1162/neco.2006.18.7.1527. Hinton, G.E., Srivastava, N., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.R., 2012. Improving Neural Networks by Preventing Co-adaptation of Feature Detectors arXiv:1207.0580 [cs]. Hochreiter, S., Schmidhuber, J., 1997. Long short-term memory. Neural Comput. 9, 1735e1780. https://doi.org/10.1162/neco.1997.9.8.1735. Hopfield, J.J., 1982. Neural networks and physical systems with emergent collective computational abilities. Proc. Natl. Acad. Sci. U S A 79, 2554e2558. https://doi.org/10.1073/pnas.79.8.2554. Ide, H., Kurita, T., 2017. Improvement of learning for CNN with ReLU activation by sparse regularization. In: 2017 International Joint Conference on Neural Networks (IJCNN). Presented at the 2017 International Joint Conference on Neural Networks. IJCNN), pp. 2684e2691. https://doi.org/10.1109/IJCNN.2017.7966185. Ivakhnenko, A.G., Lapa, V.G., 1966. Cybernetic Predicting Devices, (No. TR-EE66-5). Purdue Univ Lafayette ind School of Electrical Engineering. Jacobs, R.A., 1988. Increased rates of convergence through learning rate adaptation. Neural Network. 1, 295e307. https://doi.org/10.1016/0893-6080(88)90003-2. Karunanithi, N., Grenney, W.J., Whitley, D., Bovee, K., 1994. Neural networks for river flow prediction. J. Comput. Civ. Eng. 8, 201e220. https://doi.org/10.1061/(ASCE)08873801(1994)8:2(201). Kavitha, B.R., Srimathi, C., 2019. Benchmarking on offline Handwritten Tamil Character Recognition using convolutional neural networks. J. King Saud Univ. Comput. Inf. Sci. https:// doi.org/10.1016/j.jksuci.2019.06.004. Khan, M.H.R., Liu, J., Liu, S., Seddique, A.A., Cao, L., Rahman, A., 2019. Clay mineral compositions in surface sediments of the Ganges-Brahmaputra-Meghna river system of Bengal Basin, Bangladesh. Mar. Geol. 412, 27e36. https://doi.org/10.1016/j.margeo.2019.03.007. ¨ ., 2004. river flow modeling using artificial neural networks. J. Hydrol. Eng. ASCE 9, Kis¸i, O 60e63. https://doi.org/10.1061/(ASCE)1084-0699(2004)9:1(60). ¨ ., 2005. Daily river flow forecasting using artificial neural networks and auto-regressive Kis¸i, O models. Turk. J. Eng. Environ. Sci. 29, 9e20. ¨ ., 2009. Neural networks and wavelet conjunction model for intermittent streamflow Kis¸i, O forecasting. J. Hydrol. Eng. ASCE 14, 773e782. https://doi.org/10.1061/(ASCE)HE.19435584.0000053. ¨ ., Sanikhani, H., 2015. Prediction of long-term monthly precipitation using several soft Kis¸i, O computing methods without climatic data. Int. J. Climatol. 35, 4139e4150. https://doi.org/ 10.1002/joc.4273. Kumar, D.N., Srinivasa Raju, K., Sathish, T., 2004. River flow forecasting using recurrent neural networks. Water Resour. Manag. 18, 143e161. https://doi.org/10.1023/ B:WARM.0000024727.94701.12. Kumar, D., Singh, A., Samui, P., Jha, R.K., 2019. Forecasting monthly precipitation using sequential modelling. Hydrol. Sci. J. 64, 690e700. https://doi.org/10.1080/ 02626667.2019.1595624. Lapedes, A., Farber, R., 1986. A self-optimizing, nonsymmetrical neural net for content addressable memory and pattern recognition. Physica D Nonlinear Phenom. 22, 247e259. https://doi.org/10.1016/0167-2789(86)90244-7. Proceedings of the Fifth Annual International Conference. Le, X.H., Ho, H.V., Lee, G., Jung, S., 2019. Application of long short-term memory (LSTM) neural network for flood forecasting. 
Water 11, 1387. https://doi.org/10.3390/w11071387.

168 Advances in Streamflow Forecasting LeCun, Y., Simard, P.Y., Pearlmutter, B., 1993. Automatic learning rate maximization by on-line estimation of the hessian’s eigenvectors. In: Advances in Neural Information Processing Systems, pp. 156e163. Lewis, N.D.C., 2017. Neural Networks for Time Series Forecasting with R: An Intuitive Step by Step Blueprint for Beginners. CreateSpace Independent Publishing Platform. Lin, M., Chen, Q., Yan, S., 2013. Network in Network arXiv preprint arXiv:1312.4400. Lohani, A.K., Kumar, R., Singh, R.D., 2012. Hydrological time series modeling: a comparison between adaptive neuro-fuzzy, neural network and autoregressive techniques. J. Hydrol. 442 (443), 23e35. https://doi.org/10.1016/j.jhydrol.2012.03.031. Markus, M., Salas, J.D., Shin, H.S., 1995. Predicting streamflows based on neural networks. In: International Water Resources Engineering Conference e Proceedings. Presented at the Proceedings of the 1st International Conference on Water Resources. Part 1 (of 2). ASCE, pp. 1641e1646. Muttiah, R.S., Srinivasan, R., Allen, P.M., 1997. Prediction of two-year peak stream-discharges using neural Networks1. J. Am. Water Res. Assoc. (JAWRA) 33, 625e630. https://doi.org/ 10.1111/j.1752-1688.1997.tb03537.x. Nair, V., Hinton, G.E., 2010. Rectified linear units improve restricted Boltzmann machines. In: Proceedings of the 27th International Conference on Machine Learning. ICML-10), pp. 807e814. Neuneier, R., Zimmermann, H.G., 1998. How to train neural networks. In: Orr, G.B., Mu¨ller, K.-R. (Eds.), Neural Networks: Tricks of the Trade, Lecture Notes in Computer Science. Springer Berlin Heidelberg, Berlin, Heidelberg, pp. 373e423. https://doi.org/10.1007/3-540-49430-8_18. Olah, C., 2015. Understanding LSTM Networks. Retrieved from. http://colah.github.io/posts/201508-Understanding-LSTMs. Oord, A.V.D., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A., Kavukcuoglu, K., 2016. WaveNet: A Generative Model for Raw Audio arXiv preprint arXiv:1609.03499. Poff, N.L., Tokar, S., Johnson, P., 1996. Stream hydrological and ecological responses to climate change assessed with an artificial neural network. Limnol. Oceanogr. 41, 857e863. https:// doi.org/10.4319/lo.1996.41.5.0857. Pratt, S., Ochoa, A., Yadav, M., Sheta, A., Eldefrawy, M., 2019. Handwritten digits recognition using convolution neural networks. J. Comput. Sci. Coll. 40. Pulido-Calvo, I., Gutierrez-Estrada, J.C., 2009. Improved irrigation water demand forecasting using a soft-computing hybrid model. Biosyst. Eng. 102, 202e218. https://doi.org/10.1016/ j.biosystemseng.2008.09.032. Rahmani-Rezaeieh, A., Mohammadi, M., Mehr, A.D., 2020. Ensemble gene expression programming: a new approach for evolution of parsimonious streamflow forecasting model. Theor. Appl. Climatol. 139 (1e2), 549e564. Rehman, S. ur, Yang, Z., Shahid, M., Wei, N., Huang, Y., Waqas, M., Tu, S.-S., Rehman, O. ur, 2019. Water Preservation in Soan River Basin Using Deep Learning Techniques. CoRR abs/ 1906.10852. Reza, M., Harun, S., Askari, M., 2018. Streamflow forecasting in bukit merah watershed by using ARIMA and ANN. Portal: Jurnal Teknik Sipil 9. https://doi.org/10.30811/portal.v9i1.612. Rosenblatt, F., 1958. The perceptron: a probabilistic model for information storage and organization in the brain. Psychol. Rev. 65, 386e408. https://doi.org/10.1037/h0042519. Rumelhart, D.E., Hinton, G.E., Williams, R.J., 1986. Learning representations by backpropagating errors. Nature 323, 533e536. 
https://doi.org/10.1038/323533a0.


Sahoo, B.B., Jha, R., Singh, A., Kumar, D., 2019. Long short-term memory (LSTM) recurrent neural network for low-flow hydrological time series forecasting. Acta Geophys. 67, 1471e1481. https://doi.org/10.1007/s11600-019-00330-1. Salas, J.D., Smith, R.A., 1981. Physical basis of stochastic models of annual flows. Water Resour. Res. 17, 428e430. https://doi.org/10.1029/WR017i002p00428. Schmidhuber, J., 2015. Deep learning in neural networks: an overview. Neural Netw. 61, 85e117. https://doi.org/10.1016/j.neunet.2014.09.003. Silva, F.M., Almeida, L.B., 1990. Speeding up backpropagation. In: Eckmiller, R. (Ed.), Advanced Neural Computers. North-Holland, Amsterdam, pp. 151e158. https://doi.org/10.1016/B9780-444-88400-8.50022-4. Sit, M., Demir, I., 2019. Decentralized Flood Forecasting Using Deep Neural Networks arXiv:1902.02308 [cs, stat]. Smolensky, P., 1986. Chapter 6: information processing in dynamical systems: foundations of harmony theory. In: Rumelhart, D.E., McLelland, J.L. (Eds.), Parallel Distributed Processing: Explorations in the Microstructure of Cognition, vol. 1. MIT Press, Foundations, pp. 194e281. Sun, Y., Niu, J., Sivakumar, B., 2019. A comparative study of models for short-term streamflow forecasting with emphasis on wavelet-based approach. Stoch. Environ. Res. Risk Assess. 1e17. Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A., 2015. Going deeper with convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1e9. Tawfik, M., Ibrahim, A., Fahmy, H., 1997. Hysteresis sensitive neural network for modeling rating curves. J. Comput. Civ. Eng. 11 (3), 206e211. https://doi.org/10.1061/(ASCE)08873801(1997)11:3(206). Toth, E., Brath, A., 2007. Multistep ahead streamflow forecasting: role of calibration data in conceptual and neural network modeling. Water Resour. Res. 43 https://doi.org/10.1029/ 2006WR005383. Uysal, G., ſorman, A.A., ſensoy, A., 2016. Streamflow forecasting using different neural network models with satellite data for a snow dominated region in Turkey. Procedia Engineering 154, 1185e1192. https://doi.org/10.1016/j.proeng.2016.07.526. Viswanath, S., Saha, M., Mitra, P., Nanjundiah, R.S., 2019. Deep learning based LSTM and SeqToSeq models to detect monsoon spells of India. In: Rodrigues, J.M.F., Cardoso, P.J.S., Monteiro, J., Lam, R., Krzhizhanovskaya, V.V., Lees, M.H., Dongarra, J.J., Sloot, P.M.A. (Eds.), Computational Science e ICCS 2019, Lecture Notes in Computer Science. Springer International Publishing, pp. 204e218. Vogl, T.P., Mangis, J.K., Rigler, A.K., Zink, W.T., Alkon, D.L., 1988. Accelerating the convergence of the back-propagation method. Biol. Cybern. 59, 257e263. https://doi.org/10.1007/ BF00332914. Waibel, A., 1987. Phoneme recognition using time-delay neural networks. In: Presented at the Meeting of the Institute of Electrical, Information and Communication Engineers (IEICE), Tokyo, Japan. https://doi.org/10.1109/29.21701. Wan, R., Mei, S., Wang, J., Liu, M., Yang, F., 2019. Multivariate temporal convolutional network: a deep neural networks approach for multivariate time series forecasting. Electronics 8, 876. https://doi.org/10.3390/electronics8080876. Wang, J.-H., Lin, G.-F., Chang, M.-J., Huang, I.-H., Chen, Y.-R., 2019. Real-time water-level forecasting using dilated causal convolutional neural networks. Water Resour. Manag. 33, 3759e3780. https://doi.org/10.1007/s11269-019-02342-4.

170 Advances in Streamflow Forecasting Werbos, P.J., 1975. Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences. Ph. D. dissertation. Harvard University. Werbos, P.J., 1990. Backpropagation through time: what it does and how to do it. Proc. IEEE 78 (10), 1550e1560. https://doi.org/10.1109/5.58337. Xu, J., Zhang, W., Zhao, J., 2009. Stream flow forecasting by artificial neural network and TOPMODEL in Baohe River basin. In: 2009 Third International Symposium on Intelligent Information Technology Application Workshops. Presented at the 2009 3rd International Symposium on Intelligent Information Technology Application Workshops (IITAW). IEEE, NanChang, China, pp. 186e189. https://doi.org/10.1109/IITAW.2009.27. Yan, L., Feng, J., Hang, T., 2019. Small watershed stream-flow forecasting based on LSTM. In: Lee, S., Ismail, R., Choo, H. (Eds.), Proceedings of the 13th International Conference on Ubiquitous Information Management and Communication (IMCOM) 2019. Springer International Publishing, Cham, pp. 1006e1014. https://doi.org/10.1007/978-3-030-19063-7_79. Yaseen, Z.M., El-Shafie, A., Jaafar, O., Afan, H.A., Sayl, K.N., 2015. Artificial intelligence based models for stream-flow forecasting: 2000e2015. J. Hydrol. 530, 829e844. https://doi.org/ 10.1016/j.jhydrol.2015.10.038. Yu, X.-H., Chen, G.-A., Cheng, S.-X., 1995. Dynamic learning rate optimization of the backpropagation algorithm. IEEE Trans. Neural Network. 6, 669e677. https://doi.org/10.1109/ 72.377972. Zhou, K., Zheng, Y., Li, B., Dong, W., Zhang, X., 2019. Forecasting different types of convective weather: a deep learning approach. J. Meteorol. Res. 33 (5), 797e809. https://doi.org/10.1007/ s13351-019-8162-6.

Chapter 6

Application of artificial neural network and adaptive neuro-fuzzy inference system in streamflow forecasting
Mehdi Vafakhah, Saeid Janizadeh
Department of Watershed Management, Faculty of Natural Resources and Marine Sciences, Tarbiat Modares University, Noor, Mazandaran Province, Iran

6.1 Introduction
Streamflow is one of the most important hydrological processes in rainfall-runoff modeling of a watershed. In this chapter, the term streamflow is used in two forms: (1) peak discharge (m³/s) and (2) runoff volume (m³). Estimation of the peak discharge and runoff volume is an important step in flood management and control, design of hydraulic structures, watershed management, and formulation of reservoir operation policies, among others (Gericke and Smithers, 2014). Rainfall-runoff modeling involves a nonlinear and complex process, which is affected by salient physical and often independent factors such as physiography, geology, and land cover (Kisi et al., 2013). Accurate measurement of the peak discharge and runoff volume in a watershed is a difficult task, especially in developing countries, mainly because of the inadequate number of hydrometric stations and the time and cost incurred in collecting hydrometric data from the low-order drainage streams that are the major areas of watershed-based operations and management (Tayfur and Singh, 2006). Thus, hydrological models are needed to estimate the peak discharge and runoff volume from other, indirectly related hydrological factors such as rainfall, land use/land cover, soil type, etc. Given the easy access to rainfall data, hydrological models based on rainfall characteristics prove to be a very practical and rational approach for estimating the peak discharge as well as the runoff volume of a watershed (Sadeghi et al., 2007).



In recent years, various rainfall-runoff models have been developed and significantly improved. The use of rainfall-runoff models dates back to the late 19th century, and several hydrological models are currently available for simulating the rainfall-runoff process (Wu and Chau, 2011). Rainfall-runoff models include physical, conceptual, empirical, and artificial intelligence (AI) models (Jothiprakash and Magar, 2009). Nowadays, AI models such as the artificial neural network (ANN) and the adaptive neuro-fuzzy inference system (ANFIS) are widely used for modeling and estimating nonlinear and complicated hydrological processes (Jalalkamali and Jalalkamali, 2018). One of the important characteristics of AI models is their ability to establish the relationship between the input and output of a process without taking the physics of the process into account (Vafakhah, 2012; Moradi et al., 2019). Tayfur and Singh (2006) modeled flood hydrographs using ANN and fuzzy logic (FL) models in the Alazan Creek watershed, Texas; both models simulated the flood hydrographs with good performance. Salajegheh et al. (2008) used the statistical autoregressive moving average with exogenous inputs (ARMAX) model and two data-driven AI models, i.e., ANN and ANFIS, for flow discharge forecasting in the Karaj and Jajrood watersheds, Iran; the hybrid models performed better than the individual models in forecasting flow discharge, and the AI models outperformed the linear ARMAX models. Firat and Turan (2010) compared autoregressive (AR), ANN, and ANFIS models for forecasting the monthly streamflow of the Goksu River in the Seyhan watershed in southern Turkey and concluded that the ANFIS model performed better than the ANN and AR models. Mukerji et al. (2009) predicted flood hydrographs in the Ajay River watershed, India, using three models: ANN, ANFIS, and an adaptive neuro-genetic algorithm integrated system (ANGIS). The ANGIS model predicted the flood hydrographs better than the other two models; the ANFIS and ANN models gave similar results in some cases, but in most cases the ANFIS model performed better than the ANN model. Dorum et al. (2010) performed rainfall-runoff modeling using multiple regression (MR), ANN, and ANFIS models in the Susurluk watershed, Turkey, and found that the ANN and ANFIS models performed better than the MR model. Bisht and Jangid (2011) predicted the discharge of the Rajahmundry River, India, using ANFIS and MR models, and the ANFIS model predicted the river discharge better than the multilinear regression model. Kisi et al. (2013) comparatively evaluated the performance of three models, i.e., ANN, ANFIS, and gene expression programming (GEP), in simulating the rainfall-runoff process in the Kurukava watershed of northern Turkey and concluded that the GEP model yielded the better simulation results. Solgi et al. (2014) used hybrid wavelet-ANN and ANFIS models to predict daily precipitation at the Verayneh and Nahavand stations in Hamadan, Iran; the combined wavelet-ANN model was better at predicting precipitation than the ANFIS model. Bartoletti et al. (2018) investigated rainfall-runoff modeling


in the Ombrone Pistoiese and Cecina River catchments in Tuscany, Italy, using combinations of ANFIS, principal component analysis (PCA), and Thiessen polygons. The combination of ANFIS and PCA performed better than the combination of ANFIS and Thiessen polygons for rainfall-runoff modeling. Nourani et al. (2019) evaluated the performance of a hybrid wavelet and M5 decision tree for rainfall-runoff modeling in the Aji Chai catchment, Iran; the hybrid wavelet-M5 model was more efficient in modeling the rainfall-runoff process than the individual models. Considering the increasing applications of the ANN and ANFIS AI models in recent studies reported from all over the world, these models may be regarded as appropriate for predicting flood peak discharge and runoff volume for efficient reservoir management and flood control. Furthermore, the easy availability of early and frequently recorded rainfall data in most watersheds enhances the practical value of the ANN and ANFIS models in predicting peak discharge and runoff. This chapter presents an overview of the ANN and ANFIS models along with their theoretical details. Subsequently, a case study is presented in which both the ANN and ANFIS models are applied to estimate flood peak discharge and runoff volume from rainfall characteristics in the Kasilian watershed, Mazandaran province, Iran. The case study also compares the performance of the two models in predicting the peak discharge and runoff.

6.2 Theoretical description of models
6.2.1 Artificial neural network
An ANN is an information-processing paradigm inspired by the biological nervous system; it processes information in a way analogous to the brain. The system is made up of a large number of highly interconnected processing elements, called neurons, that work together to solve a problem (Vafakhah, 2012). The multilayer perceptron (MLP) is one of the most widely used neural networks, especially in modeling applications, and it has been applied in recent decades in various fields of science and engineering (Riedmiller and Lernen, 2014). This network consists of an input layer, an output layer, and one or more layers between them that are not directly connected to the input data or the output results. The input layer units are solely responsible for distributing the input values to the next layer, and the output layer provides the response in the form of output signals; in these two layers, the number of neurons equals the number of inputs and outputs, respectively. The hidden layers are responsible for connecting the input and output layers (Riedmiller and Lernen, 2014). Unlike the input and output layers, the number of hidden layers and the number of neurons in them cannot be determined simply from the type of problem. Instead, they are determined by a trial-and-error process in such a way that the


neural network derived from this architecture performs best in making reliable and accurate predictions (Vafakhah, 2012; Kisi et al., 2013; Vafakhah et al., 2014; Vafakhah and Kahneh, 2016). Finding the optimal total number of hidden-layer neurons analytically is very complicated. However, it can be said that the number of neurons in the hidden layer is a function of the number of input elements as well as the maximum number of regions of the input space that can be linearly separated. Each neuron connects through its output to the neurons of the next layer, but not to the neurons of its own layer. The output of each neuron is defined by Eq. (6.1):

$$a = f\left(\sum_{i=1}^{n} P_i\, W_{j,i} + b_j\right) \qquad (6.1)$$

where W_{j,i} is the weight of the connection between the jth neuron of the given layer and the ith neuron of the previous layer, expressing the importance of the interconnection between two consecutive layers; b_j is the bias weight of the jth neuron; P_i is the output of the ith neuron of the previous layer; a is the output of the jth neuron; and f is the threshold (transfer) function of the jth neuron (Coulibaly et al., 2001). Many functions can be used to pass values from one layer to the next, including the sigmoid, Gaussian, hyperbolic tangent, and hyperbolic secant functions (Coulibaly et al., 2001).
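To make Eq. (6.1) concrete, here is a minimal NumPy sketch of a forward pass through one hidden layer and a linear output layer (an illustration only; the weights and inputs are random placeholders, not values from this study):

```python
import numpy as np

def layer_output(P, W, b, f=np.tanh):
    """Eq. (6.1): a_j = f(sum_i P_i * W_{j,i} + b_j) for every neuron j."""
    return f(W @ P + b)

# Toy example: 4 rainfall-derived inputs -> 3 hidden neurons -> 1 output
rng = np.random.default_rng(0)
P = rng.random(4)                                   # input vector (e.g., scaled rainfall variables)
W_hidden, b_hidden = rng.random((3, 4)), rng.random(3)
W_out, b_out = rng.random((1, 3)), rng.random(1)

hidden = layer_output(P, W_hidden, b_hidden)                          # tanh hidden layer
peak_discharge = layer_output(hidden, W_out, b_out, f=lambda x: x)    # linear output layer
print(peak_discharge)
```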

6.2.2 Adaptive neuro-fuzzy inference system
One of the most common neuro-fuzzy systems is ANFIS, introduced by Jang (1993). This model combines FL with a neural network (Kisi et al., 2013). Adaptive neuro-fuzzy modeling is known as a powerful tool that can combine information from different sources, such as empirical models, heuristics, and data, and it thereby facilitates the development of effective models. An ANFIS model describes a system through "if-then" rules embedded in a network structure, so that the learning algorithms used in ANNs can be applied to it. An ANFIS model is constructed from both nodes and rules: the nodes act as membership functions (MFs), while the rules model the connections between a dependent variable and the independent variables. Several types of MFs (e.g., sigmoid, triangular, Gaussian, trapezoidal) can be used in designing an ANFIS model (Vafakhah, 2013; Ghanjkhanlo et al., 2020). A basic ANFIS model is shown in Fig. 6.1, in which there are two inputs, X1 and X2, and the output is Y. This architecture with two fuzzy if-then rules is based on the following equations:


FIGURE 6.1 Typical adaptive neuro-fuzzy inference system architecture.

Rule 1: if X1 is C1 and X2 is E1, then D1 = p1·X1 + q1·X2 + h1
Rule 2: if X1 is C2 and X2 is E2, then D2 = p2·X1 + q2·X2 + h2

where X1 and X2 are the inputs; C1, C2, E1, and E2 are the input MFs; p1, q1, and h1 (and, similarly, p2, q2, and h2) are the parameters of the output functions, which are determined during the training process; D1 and D2 are the outputs of the fourth layer; and Y is the output (Ghanjkhanlo et al., 2020).
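A minimal sketch of how these two first-order Sugeno rules combine into the output Y through a weighted average is given below; the Gaussian MFs and all parameter values are hypothetical and chosen only for illustration:

```python
import numpy as np

def gauss_mf(x, c, sigma):
    """Gaussian membership function."""
    return np.exp(-0.5 * ((x - c) / sigma) ** 2)

def anfis_two_rules(x1, x2):
    # Layer 1: membership degrees (centers and widths are illustrative only)
    C1, C2 = gauss_mf(x1, 5.0, 2.0), gauss_mf(x1, 15.0, 4.0)
    E1, E2 = gauss_mf(x2, 2.0, 1.0), gauss_mf(x2, 6.0, 2.0)
    # Layer 2: rule firing strengths (product T-norm)
    w1, w2 = C1 * E1, C2 * E2
    # Layers 3-4: first-order consequents (hypothetical p, q, h parameters)
    p1, q1, h1 = 0.4, 1.2, 0.1
    p2, q2, h2 = 0.9, 0.3, 2.0
    D1 = p1 * x1 + q1 * x2 + h1
    D2 = p2 * x1 + q2 * x2 + h2
    # Layer 5: weighted-average output Y
    return (w1 * D1 + w2 * D2) / (w1 + w2)

print(anfis_two_rules(x1=10.0, x2=3.0))
```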

6.3 Application of ANN and ANFIS for prediction of peak discharge and runoff: a case study
6.3.1 Study area description
The case study presented in this chapter was performed in the Kasilian watershed, one of the subbasins of the Talar River in Iran. The study area covers 66.75 km², extending from 35°58′30″ to 36°07′15″ N latitude and from 53°08′44″ to 53°15′42″ E longitude, with elevations ranging from 1100 to 2700 m above mean sea level (a.s.l.) and an average elevation of 1576 m a.s.l. A location map of the study area is shown in Fig. 6.2. The average annual rainfall of the study area is 783.4 mm. According to the De Martonne climatic classification, the climate of the watershed is very humid (Saghafian et al., 2015).

6.3.2 Methodology
In the study area, there are one rain gauge station and one hydrometric station, the latter equipped with a concrete Parshall flume, a stage gauge, and a limnograph.


FIGURE 6.2 Location map of the study area: (A) Iran, (B) Mazandaran Province, and (C) Kasilian watershed.

All rainfall hyetographs recorded at the Sangdeh synoptic rain gauge station, located at the gravity center of the basin, and all flood hydrographs recorded at the Valik Bon hydrometric station, located at the watershed outlet, were obtained for the period 1975–2009 from the Iran Water Resources Research Company (IWRRC), Iran. The individual flood hydrographs and the corresponding rainfall hyetographs were then extracted for 1975–2009, and 60 rainfall-runoff events were considered for the case study. Before the AI modeling, baseflow was separated from all the flood hydrographs using a graphical baseflow separation technique, i.e., the constant-slope method. A total of 15 variables related to the rainfall events were determined, without baseflow, for all the events. The independent rainfall variables derived for each event from its hyetograph are (i) total rainfall amount, (ii) total rainfall duration, (iii) excess rainfall, (iv) the center of mass of rainfall excess, (v) the center of mass of total rainfall, (vi) average rainfall intensity, (vii) maximum 30-min rainfall intensity, (viii) time maximum occurrence 30-min intensity of rainfall, (ix) maximum 15-min rainfall intensity, (x) time maximum occurrence 15-min intensity of rainfall, (xi) rainfall amount in the first quartile, (xii) rainfall amount in the second quartile, (xiii) rainfall amount in the third quartile, (xiv) rainfall amount in the fourth quartile, and (xv) excess rainfall duration. The amount and duration of rainfall excess were computed using the φ-index method (McCuen, 2005). The runoff volume and flood peak discharge


were determined for each hydrograph and were considered as the dependent variables. The dates of rainfall-runoff events used in this study are given in Table 6.1. Some of the main characteristics of the rainfall-runoff events are summarized in Table 6.2. Modeling of flood properties, i.e., peak discharge and runoff volume, consists of five main steps: (i) extraction of different rainfall characteristics affecting flood properties, (ii) selection of independent variables using PCA as input to the model, (iii) development of data mining models, (iv) model evaluation and validation, and (v) sensitivity analysis and determination of the most important variables. Detailed systematic methodology followed in modeling the rainfall-runoff using the ANN and ANFIS AI models in this study is illustrated through a flowchart shown in Fig. 6.3.
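The φ-index computation mentioned above can be illustrated with a short sketch; the hyetograph, time step, and runoff depth below are hypothetical, and bisection is just one convenient way to solve for the constant loss rate:

```python
import numpy as np

def phi_index(intensities_mm_per_hr, dt_hr, runoff_depth_mm, tol=1e-6):
    """Constant loss rate (mm/hr) such that excess rainfall equals the direct-runoff depth."""
    p = np.asarray(intensities_mm_per_hr, dtype=float)

    def excess(phi):
        return np.sum(np.clip(p - phi, 0.0, None)) * dt_hr

    lo, hi = 0.0, p.max()
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if excess(mid) > runoff_depth_mm:
            lo = mid            # too little loss -> increase phi
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical 30-min hyetograph (mm/hr) and an observed direct-runoff depth of 6 mm
print(phi_index([2.0, 8.0, 16.0, 6.0, 1.0], dt_hr=0.5, runoff_depth_mm=6.0))
```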

6.3.2.1 Principal component analysis
PCA is a way of identifying patterns in data and expressing the data so as to highlight their similarities and differences. In other words, PCA is a multivariate statistical data mining method used to reduce the dimensionality of a dataset while preserving as much of its variation as possible, allowing faster and more efficient data processing (Jolliffe, 2011). Using PCA, a large number of variables can be reduced to a few significant factors, and the resulting factors thus provide a summary of the original data (Wang et al., 2009). In fact, the PCA method maximizes the sum of squared correlations: it yields the first principal factor vector, which is linearly related to the original variables and has the highest sum of squared correlations with them, and the eigenvector corresponding to the largest eigenvalue gives the desired weights from the correlation matrix. To ensure that the data are suitable for PCA, which is one of the basic models of factor analysis, the Kaiser–Meyer–Olkin (KMO) measure and Bartlett's test should be used. Ideally, KMO equals one; high KMO values indicate a PCA with few errors overall, and PCA can be used when KMO exceeds 0.5 (Abdi and Williams, 2010; Kheirfam and Vafakhah, 2015). The main purpose of using PCA is data reduction by identifying the variables most important in describing the phenomenon or system. In this study, PCA was performed in SPSS version 21 on the 15 variables of the 60 selected storms to obtain the factor weight matrix. PCA in nonrotated and rotated (varimax) modes gave similar answers; finally, varimax rotation, one of the common methods, was selected as the appropriate method for selecting the axes.

6.3.2.2 Artificial neural network
In this case study, the feedforward neural network technique with the backpropagation (BP) learning rule was used. A three-layer perceptron network with sigmoid and hyperbolic tangent transfer functions in the hidden layer and a linear transfer function in the output layer was used for modeling the peak discharge and runoff volume.
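As a rough illustration of the PCA-based screening of input variables described in Section 6.3.2.1 (the chapter itself used SPSS; the data here are a random stand-in for the 60 × 15 storm-variable matrix):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
X = rng.random((60, 15))                 # placeholder for 60 storms x 15 rainfall variables

X_std = StandardScaler().fit_transform(X)
pca = PCA(n_components=4).fit(X_std)

print("explained variance (%):", np.round(100 * pca.explained_variance_ratio_, 2))
loadings = pca.components_.T             # rows: variables, columns: principal components
print("variable with highest |loading| on each PC:", np.abs(loadings).argmax(axis=0))
```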

TABLE 6.1 Rainfall-runoff events used in this study for training, validation, and testing of the artificial intelligence models.

Training (42 events): September 28, 1975; October 18, 1976; October 29, 1976; June 2, 1978; October 20, 1981; December 6, 1981; October 6, 1982; October 12, 1983; May 5, 1984; October 7, 1984; May 21, 1985; November 26, 1985; January 4, 1986; May 4, 1986; November 5, 1986; October 26, 1987; November 7, 1987; January 30, 1988; September 21, 1988; November 20, 1988; March 15, 1989; April 28, 1989; May 24, 1991; July 14, 1991; October 5, 1991; June 20, 1992; June 3, 1993; July 11, 1993; February 13, 1994; November 26, 1994; March 17, 1995; June 16, 1995; October 12, 1995; May 5, 1997; July 2, 1997; May 5, 2004; November 8, 2005; September 17, 2006; November 8, 2006; April 2, 2007; May 14, 2007; August 15, 2009

Validation (9 events): October 20, 1987; August 31, 1988; September 2, 1990; October 2, 1990; June 19, 1991; May 6, 1993; October 15, 1996; November 15, 2006; March 2, 2007

Testing (9 events): May 9, 1982; May 17, 1986; May 29, 1987; May 30, 1987; October 7, 1992; October 22, 1994; October 6, 1996; July 24, 2003; March 27, 2007

TABLE 6.2 Descriptive statistics of the 15 rainfall-related variables and the peak runoff discharge.

S. No. | Variable | Maximum | Minimum | Mean | Standard deviation | Skewness
1 | Total rainfall amount | 55.63 | 3.79 | 16.44 | 10.43 | 2.01
2 | Total rainfall duration | 30.75 | 1 | 7.31 | 6.21 | 2.03
3 | Rainfall excess | 12.83 | 0.30 | 2.74 | 2.66 | 2.03
4 | The center of mass of rainfall excess | 16.01 | 0.12 | 2.79 | 2.72 | 2.63
5 | The center of mass of total rainfall | 12.88 | 0.33 | 3.37 | 2.90 | 1.96
6 | Average rainfall intensity | 19.44 | 0.58 | 3.26 | 3.06 | 3.41
7 | Maximum 30-min rainfall intensity | 38.30 | 1 | 7.74 | 7.35 | 2.94
8 | Time maximum occurrence 30-min intensity of rainfall | 22.50 | 0 | 2.58 | 3.59 | 3.38
9 | Maximum 15-min rainfall intensity | 48.12 | 1 | 9.94 | 9.09 | 2.78
10 | Time maximum occurrence 15-min intensity of rainfall | 22.50 | 0 | 2.63 | 3.56 | 3.48
11 | First quarter of total rainfall amount | 21.04 | 0.63 | 5.13 | 4.61 | 1.87
12 | Second quarter of total rainfall amount | 20.35 | 0.27 | 4.66 | 3.54 | 2.14
13 | Third quarter of total rainfall amount | 12.89 | 0.86 | 4.04 | 2.64 | 1.42
14 | Fourth quarter of total rainfall amount | 14.70 | 0.25 | 2.75 | 2.87 | 2.26
15 | Time of rainfall excess | 8 | 0 | 2.02 | 1.76 | 1.25
16 | Peak discharge | 17.15 | 0.57 | 4.63 | 4.01 | 1.34

The number of neurons in the hidden layer was determined by a trial-and-error process. Care was taken to generalize the network properly in order to avoid overfitting of the developed model. Overfitting of the MLP neural network developed in this study was avoided by using an automatic adjustment method, in which the automatic parameter estimation is achieved with the Levenberg–Marquardt training algorithm (Vafakhah, 2012).
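The trial-and-error search over hidden-layer sizes might be sketched as follows; note that this illustration uses scikit-learn's MLPRegressor with the L-BFGS solver as a stand-in (not the Levenberg–Marquardt algorithm used in the chapter), and the data are random placeholders:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(1)
X, y = rng.random((60, 4)), rng.random(60)           # 4 PCA-selected inputs, 1 target
X_tr, y_tr = X[:42], y[:42]                          # ~70% training
X_va, y_va = X[42:51], y[42:51]                      # ~15% validation

best = None
for n_hidden in range(2, 20):                        # trial-and-error over hidden neurons
    model = MLPRegressor(hidden_layer_sizes=(n_hidden,), activation="tanh",
                         solver="lbfgs", max_iter=2000, random_state=0).fit(X_tr, y_tr)
    rmse = mean_squared_error(y_va, model.predict(X_va)) ** 0.5
    if best is None or rmse < best[1]:
        best = (n_hidden, rmse)

print("selected hidden neurons:", best[0], "validation RMSE:", round(best[1], 3))
```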

6.3.2.3 Adaptive neuro-fuzzy inference system
ANFIS is, in fact, a combination of ANNs and FL systems. In ANFIS, the ANN and FL are combined using a fuzzy logic toolbox, and the parameters of the fuzzy MFs are adjusted by the BP method alone or together with the least-squares method (Jang, 1993). The ANFIS model is trained on a collection of input-output data; that is, the system learns to produce the outputs associated with the inputs on which it has been trained. This is done by correcting the MF parameters based on the selected error criterion.


FIGURE 6.3 Flowchart illustrating step-by-step procedure for applying artificial intelligence models for prediction of peak discharge and runoff volume.


In order to measure the quality of the ANFIS model in predicting the set of output values, a validation procedure is performed. In this study, fuzzification of the data was carried out using two methods: (i) grid partitioning (GP) with triangular (trimf), generalized bell (gbellmf), and Gaussian (gaussmf) MFs, and (ii) subtractive clustering (SC). Defuzzification of the data was carried out using the weighted average of all rule outputs (wtaver). In the AI modeling, 70% of the observed rainfall-runoff data were used for model training, 15% for model validation, and the remaining 15% for model testing.

6.3.2.4 Assessment of model performance by statistical indices
In this study, the performance of the models is assessed using three statistical indices: the Nash–Sutcliffe efficiency (NSE), the root mean square error (RMSE), and the coefficient of determination (R²). These indices are expressed mathematically as follows (Kisi et al., 2013):

$$NSE = 1 - \frac{\sum_{i=1}^{n}\left(Q_o - Q_e\right)^2}{\sum_{i=1}^{n}\left(Q_o - \bar{Q}_o\right)^2} \qquad (6.2)$$

$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(Q_o - Q_e\right)^2} \qquad (6.3)$$

$$R^2 = \left[\frac{\sum_{i=1}^{n}\left(Q_o - \bar{Q}_o\right)\left(Q_e - \bar{Q}_e\right)}{\sqrt{\sum_{i=1}^{n}\left(Q_o - \bar{Q}_o\right)^2 \sum_{i=1}^{n}\left(Q_e - \bar{Q}_e\right)^2}}\right]^2 \qquad (6.4)$$

where Q_o is the observed value of peak flood discharge or runoff, Q̄_o is the mean of the observed values, Q_e is the model-predicted value, and Q̄_e is the mean of the model-predicted values.
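For reference, a minimal NumPy implementation of Eqs. (6.2)–(6.4) (illustrative only; not the code used in the study):

```python
import numpy as np

def nse(q_obs, q_est):
    q_obs, q_est = np.asarray(q_obs), np.asarray(q_est)
    return 1 - np.sum((q_obs - q_est) ** 2) / np.sum((q_obs - q_obs.mean()) ** 2)

def rmse(q_obs, q_est):
    q_obs, q_est = np.asarray(q_obs), np.asarray(q_est)
    return np.sqrt(np.mean((q_obs - q_est) ** 2))

def r_squared(q_obs, q_est):
    q_obs, q_est = np.asarray(q_obs), np.asarray(q_est)
    num = np.sum((q_obs - q_obs.mean()) * (q_est - q_est.mean()))
    den = np.sqrt(np.sum((q_obs - q_obs.mean()) ** 2) * np.sum((q_est - q_est.mean()) ** 2))
    return (num / den) ** 2

q_obs = [4.2, 7.9, 1.3, 10.5, 3.8]   # hypothetical observed peak discharges (m³/s)
q_est = [4.0, 8.3, 1.9, 9.7, 3.5]    # hypothetical model estimates
print(nse(q_obs, q_est), rmse(q_obs, q_est), r_squared(q_obs, q_est))
```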

6.3.2.5 Sensitivity analysis
Sensitivity analysis identifies the effect of one specific independent variable on the dependent variable while the other independent variables are kept fixed (Buragohain and Mahanta, 2008). In this study, the independent variables selected by the PCA (i.e., time maximum occurrence 30-min intensity of rainfall, total rainfall amount, maximum 30-min rainfall intensity, and time of rainfall excess) were subjected to sensitivity analysis in MATLAB to find the effect of each variable on the peak discharge and runoff volume.


Sensitivity analysis methods can be local or global. Global sensitivity analysis considers the entire range of possible variation of the input variables and determines which input variables of a model contribute most to a quantity of interest calculated with that model (Pichery, 2014); therefore, global sensitivity analysis was used in this study. Finally, the variables with relatively higher weights were selected as the most effective variables in determining the peak discharge and runoff volume (Buragohain and Mahanta, 2008).
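One simple way to approximate such an analysis is a one-at-a-time perturbation of each selected input, sketched below; this is only a hypothetical illustration (the chapter's MATLAB procedure is not specified in detail), with a stand-in model and made-up data:

```python
import numpy as np

def model(X):
    """Stand-in for a trained ANN/ANFIS predictor; replace with the real model."""
    w = np.array([0.5, 1.8, 0.9, 6.0])      # arbitrary weights for illustration
    return X @ w

def sensitivity(X, predict, delta=0.10):
    """Relative change in the mean output when each input is perturbed by +/-10%."""
    base = predict(X).mean()
    scores = []
    for j in range(X.shape[1]):
        up, down = X.copy(), X.copy()
        up[:, j] *= 1 + delta
        down[:, j] *= 1 - delta
        scores.append(abs(predict(up).mean() - predict(down).mean()) / abs(base))
    return np.array(scores)

X = np.random.default_rng(3).random((60, 4))   # 60 events x 4 PCA-selected inputs
print(sensitivity(X, model))                    # larger value -> more influential input
```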

6.4 Results and discussion
This study was carried out to investigate the relative performance of the two AI models, i.e., ANN and ANFIS, in estimating flood peak discharge and runoff volume in the Kasilian watershed of Mazandaran province, Iran, using 60 rainfall-runoff events. Tables 6.3 and 6.4 show the results of the PCA in terms of eigenvalues, total variance, and the factor loading matrix. According to Tables 6.3 and 6.4, the four extracted principal components (PCs) explain about 86% of the total variance of the system. It is further revealed that the first PC has a direct relationship with the center of mass of rainfall excess, the center of mass of total rainfall, the time maximum occurrence 30-min intensity of rainfall, and the time maximum occurrence 15-min intensity of rainfall; hence, the first PC consists of variables related to time variations. Because the time maximum occurrence 30-min intensity of rainfall has the highest weight on the first PC, it was selected to represent this component. The second PC has a direct relationship with the total rainfall amount and with the first and second quarters of the total rainfall amount; hence, the second PC consists of variables related to the amount of rainfall. The total rainfall amount has the highest weight on the second PC and can therefore explain the other variables associated with it. The maximum 30-min rainfall intensity and the rainfall excess were selected as the major factors defining the third and fourth PCs, respectively.

TABLE 6.3 Eigenvalues and total variance explained by the components selected from principal component analysis.

Component | Total (initial eigenvalue) | Percentage of variance (%) | Cumulative percentage (%)
1 | 6.09 | 40.56 | 40.58
2 | 4.01 | 26.7 | 67.28
3 | 1.66 | 11.07 | 78.53
4 | 1.15 | 7.67 | 86.03


TABLE 6.4 Values of factor loadings for the first four principal components. Principal component Variable

1

2

3 a

4

Total rainfall amount

0.19

0.94

0.22

Total rainfall duration

0.6

0.62

0.34

0.13

Rainfall excess

0.19

0.18

0.08

0.91a

The center of mass of rainfall excess

0.93a

0.21

0.14

0.21

a

0.56

0.31

0.17

The center of mass of total rainfall

0.71

Average rainfall intensity

0.24

Maximum 30-min rainfall intensity

0.08

Time maximum occurrence 30-min intensity of rainfall Maximum 15-min rainfall intensity Time maximum occurrence 15-min intensity of rainfall

0.08

a

0.1

a

0.92

0.2

0.95

0.15

0.98

0.05

0.05

0.11

0.07

0.18

0.94a

a

a

0.97

0.05

0.16

0.05

0.12

0.01

0.84a

0.24

0.03

Second quarter of total rainfall amount

0.08

a

0.72

0.37

0.07

Third quarter of total rainfall amount

0.13

0.63

0.01

0.41

Fourth quarter of total rainfall amount

0.48

0.6

0.05

0.02

Time of rainfall excess

0.16

0.03

0.27

First quarter of total rainfall amount

0.85a

a

Bold values indicate the numbers more than 0.7.

6.4.1 Results of ANN modeling
Values of the three performance assessment indices, i.e., R², RMSE, and NSE, for the ANN models in the training, validation, and testing periods are given in Table 6.5. In the estimation of peak discharge, the hyperbolic tangent transfer function showed the better performance in the model testing stage, with R² of 0.86, RMSE of 1.28 m³/s, and NSE of 0.82, compared with the sigmoid function (Fig. 6.4). In the estimation of runoff volume, however, the sigmoid function yielded the better results in the testing stage, with R² of 0.98, RMSE of 10,282.82 m³, and NSE of 0.98, compared with the hyperbolic tangent function (Fig. 6.5).

6.4.2 Results of ANFIS modeling
Values of the statistical model assessment indices, i.e., R², RMSE, and NSE, for the ANFIS models in the training, validation, and testing periods are given in Table 6.6.


TABLE 6.5 Values of the three performance assessment indices for the artificial neural network models.

Variable | Model architecture | Transfer function in the hidden layer | Stage | R² | RMSE | NSE
Peak discharge (m³/s) | 4-8-1 | Hyperbolic tangent | Training | 0.91 | 1.13 | 0.9
Peak discharge (m³/s) | 4-8-1 | Hyperbolic tangent | Validation | 0.79 | 3.44 | 0.54
Peak discharge (m³/s) | 4-8-1 | Hyperbolic tangent | Testing | 0.86 | 1.28 | 0.82
Peak discharge (m³/s) | 4-19-1 | Sigmoid | Training | 0.83 | 1.38 | 0.84
Peak discharge (m³/s) | 4-19-1 | Sigmoid | Validation | 0.71 | 4.21 | 0.32
Peak discharge (m³/s) | 4-19-1 | Sigmoid | Testing | 0.79 | 2.05 | 0.59
Runoff volume (m³) | 4-11-1 | Hyperbolic tangent | Training | 0.99 | 3,464.47 | 0.99
Runoff volume (m³) | 4-11-1 | Hyperbolic tangent | Validation | 0.88 | 83,431.09 | 0.85
Runoff volume (m³) | 4-11-1 | Hyperbolic tangent | Testing | 0.97 | 19,356.35 | 0.97
Runoff volume (m³) | 4-11-1 | Sigmoid | Training | 0.99 | 4,857.5 | 0.99
Runoff volume (m³) | 4-11-1 | Sigmoid | Validation | 0.83 | 109,402.2 | 0.74
Runoff volume (m³) | 4-11-1 | Sigmoid | Testing | 0.98 | 10,282.82 | 0.98

Bold values indicate the best performance in terms of R², root mean square error (RMSE), and Nash–Sutcliffe efficiency (NSE).

FIGURE 6.4 Observed and estimated peak discharge with artificial neural network model in training, validation, and testing periods.


FIGURE 6.5 Observed and estimated runoff volume with artificial neural network model in training, validation, and testing periods.

It is apparent that, in the peak discharge prediction, the SC method performed better in the model testing stage, with R² of 0.95, RMSE of 1.22 m³/s, and NSE of 0.85, than the GP method (Fig. 6.6). Similarly, in the runoff volume prediction, the SC method yielded better results, with R² of 0.99, RMSE of 2369.54 m³, and NSE of 0.99, than the GP method (Fig. 6.7). This finding is in agreement with that reported by Salajegheh et al. (2008). The main difference between the SC and GP methods lies in the way the MFs are determined. When the number of input variables is small, the GP method is more suitable for data classification; however, when the number of input variables is high, the training speed of the ANFIS-SC model is better than that of the ANFIS-GP model. Owing to the large number of variables selected in this study, the SC method showed a better performance than the GP method. The results of this case study differ from those of a few past studies dealing with prediction of groundwater levels, in which the ANFIS-GP model with gbellmf had less error (Moosavi et al., 2013; Shirmohammadi et al., 2013). It is obvious from Tables 6.5 and 6.6 that the ANFIS models performed much better than the ANN models for predicting both the peak discharge and the runoff volume, with the least error and the highest efficiency. The better performance of the ANFIS models may be attributed to their combination of the learning ability of neural networks with the inference capability of FL. Finally, this study concludes that the ANFIS model had a better efficiency for modeling the rainfall-runoff process, a finding in agreement with that reported by earlier researchers (e.g., El-Shafie et al., 2011; Kisi et al., 2013; Wang et al., 2009; Dehghani et al., 2016; Vafakhah and Kahneh, 2016). Moreover, the results obtained from the sensitivity analysis are given in Table 6.7. It is seen from Table 6.7 that rainfall excess is the most effective factor, having the maximum influence on estimation of the peak discharge and runoff volume. Rainfall excess is also one of the main factors affecting the shape of the flood hydrograph.

TABLE 6.6 Values of the three performance assessment indices for adaptive neuro-fuzzy inference system models.

Trimf 2

Variable

Stage

R

Peak discharge (m3/s)

Training

0.96

0.63

0.96

0.99

Validation

0.5

4.71

0.12

Testing

0.27

3.13

Training

0.99

Validation

0.58

288,852

0.78

Testing

0.96

49,255

0.94

Runoff volume (m3)

RMSE

Gbellmf NSE

2

1,131.3

R

RMSE

Gussmf 2

RMSE

Subtractive clustering

R

0.2

0.99

0.98

0.98

0.92

0.99

0.08

0.99

0.46

4.48

0.17

0.46

0.26

0.19

0.76

3.59

0.5

0.08

0.89

2.79

0.38

0.86

1.42

0.93

0.95

1.26

0.84

0.99

0.99

687.7

0.99

0.99

0.99

0.99

0.9

73,737.8

0.88

0.88

36,945

0.99

64,771.1

0.9

0.98

14,962

1,253.1

NSE

R2

NSE

1.9 0.98

Trimf, triangular-shaped membership function, Gbellmf, bell-shaped membership function, Gussmf, Gaussian-shaped membership function.

0.91 0.99

RMSE

4,958.8 70,807 2369.5

NSE

0.99 0.89 0.99


Grid partitioning



FIGURE 6.6 Observed and estimated peak discharge with adaptive neuro-fuzzy inference system in training, validation, and testing periods.

FIGURE 6.7 Observed and estimated runoff volume with adaptive neuro-fuzzy inference system in training, validation, and testing periods.

TABLE 6.7 Results of sensitivity analysis of the model input parameters.

Variable | Peak discharge (m³/s) | Runoff volume (m³)
Time maximum occurrence 30-min intensity of rainfall | 1.21 | 0.94
Total rainfall amount | 1.19 | 4.34
Maximum 30-min rainfall intensity | 2.12 | 2.46
Rainfall excess | 9.28 | 1783.17

6.5 Conclusions
Modeling of flood peak discharge and runoff volume is one of the most important tasks in hydrological forecasting for water resources management. The literature reveals that AI models have been increasingly


employed in forecasting flood discharge and runoff in the recent past. This chapter presented an overview of the ANN and ANFIS models. In addition, a case study was presented in which the performance of the ANFIS and ANN models was comparatively evaluated for modeling flood peak discharge and runoff volume based on 15 rainfall-related variables from 60 rainfall-runoff events in the Kasilian watershed of northern Iran. The PCA method was used to determine the influential input variables for the modeling. The results showed that the ANFIS models performed better than the ANN models, and the ANFIS models with the SC method performed better than those with the GP method. Sensitivity analysis of the input variables indicated that the rainfall excess variable has a relatively higher impact on the peak discharge and runoff volume modeling. The approach demonstrated in this chapter may serve as a reference for future studies, especially where a flood hydrograph is not available. The case study presented in this chapter involved only rainfall variables for modeling the flood hydrograph; in future studies, catchment characteristics may also be used for modeling the flood hydrograph, and the resulting accuracies may be compared with each other.

References Abdi, H., Williams, L.J., 2010. Principal component analysis. Wiley Interdiscip. Rev.: Comput. Statist. 2 (4), 433e459. https://doi.org/10.1002/wics.101. Bartoletti, N., Casagli, F., Marsili-Libelli, S., Nardi, A., Palandri, L., 2018. Data-driven rainfall/ runoff modelling based on a neuro-fuzzy inference system. Environ. Model. Software 106, 35e47. https://doi.org/10.1016/j.envsoft.2017.11.026. Bisht, D.C., Jangid, A., 2011. Discharge modelling using adaptive neuro-fuzzy inference system. Int. J. Adv. Sci. Technol. 31 (1), 99e114. Buragohain, M., Mahanta, C., 2008. A novel approach for ANFIS modelling based on full factorial design. Appl. Soft Comput. 8 (1), 609e625. https://doi.org/10.1016/j.asoc.2007.03.010. Coulibaly, P., Anctil, F., Aravena, R., Bobe´e, B., 2001. Artificial neural network modeling of water table depth fluctuations. Water Resour. Res. 37 (4), 885e896. https://doi.org/10.1029/ 2000WR900368. Dehghani, N., Vafakhah, M., Bahremand, A., 2016. Rainfall-runoff modeling using artificial neural network and neuro-fuzzy inference system in kasilian watershed. J. Watershed Manage. Res. 7 (13), 128e137. https://doi.org/10.18869/acadpub.jwmr.7.13.137 (In Persian). Dorum, A., Yarar, A., Sevimli, M.F., Onu¨c¸yildiz, M., 2010. Modelling the rainfallerunoff data of Susurluk basin. Expert Syst. Appl. 37 (9), 6587e6593. https://doi.org/10.1016/ j.eswa.2010.02.127. El-Shafie, A., Jaafer, O., Akrami, S.A., 2011. Adaptive neuro-fuzzy inference system based model for rainfall forecasting in Klang River, Malaysia. Int. J. Phys. Sci. 6 (12), 2875e2888. Firat, M., Turan, M.E., 2010. Monthly river flow forecasting by an adaptive neuro-fuzzy inference system. Water Environ. J. 24 (2), 116e125. https://doi.org/10.1111/j.1747-6593.2008.00162.x. Ghanjkhanlo, H., Vafakhah, M., Zeinivand, H., Fathzadeh, A., 2020. Prediction of snow water equivalent using artificial neural network and adaptive neuro-fuzzy inference system with two sampling schemes in semi-arid region of Iran. J. Mt. Sci. (7), 1712e1723. https://doi.org/ 10.1007/s11629-018-4875-8.

190 Advances in Streamflow Forecasting Gericke, O.J., Smithers, J.C., 2014. Review of methods used to estimate catchment response time for the purpose of peak discharge estimation. Hydrol. Sci. J. 59 (11), 1935e1971. https:// doi.org/10.1080/02626667.2013.866712. Jalalkamali, A., Jalalkamali, N., 2018. Adaptive network-based fuzzy inference system-genetic algorithm models for prediction groundwater quality indices: a GIS-based analysis. J. AI Data Min. 6 (2), 439e445. https://doi.org/10.22044/JADM.2017.1086. Jang, J.S., 1993. ANFIS: adaptive-network-based fuzzy inference system. IEEE Trans. Syst., Man, Cybernet. 23 (3), 665e685. https://doi.org/10.1109/21.256541. Jothiprakash, V., Magar, R., 2009. Soft computing tools in rainfall-runoff modeling. ISH J. Hydraulic Eng. 15 (Suppl. 1), 84e96. https://doi.org/10.1080/09715010.2009.10514970. Jolliffe, I., 2011. Principal Component Analysis. Springer Berlin Heidelberg, pp. 1094e1096. https://doi.org/10.1007/978-3-642-04898-2_455. Kheirfam, H., Vafakhah, M., 2015. Assessment of some homogeneous methods for the regional analysis of suspended sediment yield in the south and southeast of the Caspian Sea. J. Earth Syst. Sci. 124 (6), 1247e1263. https://doi.org/10.1007/s12040-015-0604-7. Kisi, O., Shiri, J., Tombul, M., 2013. Modeling rainfall-runoff process using soft computing techniques. Comput. Geosci. 51, 108e117. https://doi.org/10.1016/j.cageo.2012.07.001. McCuen, R.H., 2005. Hydrologic Analysis and Design, vol. 3. Pearson prentice hall, Upper Saddle River, NJ, p. 859. Moosavi, V., Vafakhah, M., Shirmohammadi, B., Behnia, N., 2013. A wavelet-ANFIS hybrid model for groundwater level forecasting for different prediction periods. Water Resour. Manag. 27 (5), 1301e1321. https://doi.org/10.1007/s11269-012-0239-2. Moradi, H., Avand, M.T., Janizadeh, S., 2019. Landslide susceptibility survey using modeling methods. In: Spatial Modeling in GIS and R for Earth and Environmental Sciences. Elsevier, pp. 259e275. Mukerji, A., Chatterjee, C., Raghuwanshi, N.S., 2009. Flood forecasting using ANN, neuro-fuzzy, and neuro-GA models. J. Hydrol. Eng. ASCE 14 (6), 647e652. https://doi.org/10.1061/ (ASCE)HE.1943-5584.0000040. Nourani, V., Davanlou Tajbakhsh, A., Molajou, A., Gokcekus, H., 2019. Hybrid wavelet-M5 model tree for rainfall-runoff modeling. J. Hydrol. Eng. ASCE 24 (5), 04019012. https://doi.org/ 10.1061/(ASCE)HE.1943-5584.0001777. Pichery, C., 2014. Sensitivity analysis. In: Encyclopedia of Toxicology. Elsevier, pp. 236e237. Riedmiller, M., Lernen, A.M., 2014. Multi layer perceptron. In: Machine Learning Lab Special Lecture. University of Freiburg, pp. 7e24. Sadeghi, S.H.R., Mozayyan, M., Moradi, H.M., 2007. Development of hydrograph using different rainfall components in kasilian watershed. J. Iran. Nat. Res. 60 (1), 33e43 (In Persian). Saghafian, B., Meghdadi, A.R., Sima, S., 2015. Application of the WEPP model to determine sources of run-off and sediment in a forested watershed. Hydrol. Process. 29 (4), 481e497. https://doi.org/10.1002/hyp.10168. Salajegheh, A., FathabadiA, Mahdavi, M., 2008. Performance of fuzzy neural techniques and statistical model sin simulating the rainfallerunoff. J. Iran. Nat. Res. 62 (1), 65e79 (In Persian). Shirmohammadi, B., Vafakhah, M., Moosavi, V., Moghaddamnia, A., 2013. Application of several data-driven techniques for predicting groundwater level. Water Resour. Manag. 27 (2), 419e432. https://doi.org/10.1007/s11269-012-0194-y.


Solgi, A., Nourani, V., Pourhaghi, A., 2014. Forecasting daily precipitation using hybrid model of wavelet-artificial neural network and comparison with adaptive neurofuzzy inference system (case study: Verayneh station, Nahavand). Adv. Civ. Eng. 2014, 12. Article ID 279368. https:// doi.org/10.1155/2014/279368. Tayfur, G., Singh, V.P., 2006. ANN and fuzzy logic models for simulating event-based rainfallrunoff. J. Hydraul. Eng. ASCE 132 (12), 1321e1330. https://doi.org/10.1061/(ASCE)07339429(2006)132:12(1321). Vafakhah, M., 2012. Application of artificial neural networks and adaptive neuro-fuzzy inference system models to short-term streamflow forecasting. Can. J. Civ. Eng. 39 (4), 402e414. https://doi.org/10.1139/l2012-011. Vafakhah, M., 2013. Comparison of cokriging and adaptive neuro-fuzzy inference system models for suspended sediment load forecasting. Arab. J. Geosci. 6 (8), 3003e3018. https://doi.org/ 10.1007/s12517-012-0550-5. Vafakhah, M., Janizadeh, S., Khosrobeigi Bozchaloei, S., 2014. Application of several data-driven techniques for rainfall-runoff modeling. ECOPERSIA 2 (1), 455e469. Vafakhah, M., Kahneh, E., 2016. A comparative assessment of adaptive neuro-fuzzy inference system, artificial neural network and regression for modelling stage-discharge relationship. Int. J. Hortic. Sci. Technol. 6 (2), 143e159. https://doi.org/10.1504/IJHST.2016.075581. Wang, W.C., Chau, K.W., Cheng, C.T., Qiu, L., 2009. A comparison of performance of several artificial intelligence methods for forecasting monthly discharge time series. J. Hydrol. 374 (3e4), 294e306. https://doi.org/10.1016/j.jhydrol.2009.06.019. Wu, C.L., Chau, K.W., 2011. Rainfallerunoff modeling using artificial neural network coupled with singular spectrum analysis. J. Hydrol. 399 (3e4), 394e409. https://doi.org/10.1016/ j.jhydrol.2011.01.017.

Chapter 7

Genetic programming for streamflow forecasting: a concise review of univariate models with a case study
Ali Danandeh Mehr1, Mir Jafar Sadegh Safari2
1Department of Civil Engineering, Antalya Bilim University, Antalya, Turkey; 2Department of Civil Engineering, Yaşar University, Izmir, Turkey

7.1 Introduction
Over the past few decades, there has been considerable research on soft computing (SC) tools, which has increased human awareness of complicated issues and problems in various engineering fields. Generally, SC is defined as the use of inexact solutions to computationally hard tasks. Some well-known SC techniques include fuzzy logic, artificial neural networks (ANNs), evolutionary computing, and decision trees, which have been used extensively to solve a wide range of engineering problems (Huang et al., 2010; Roy et al., 2012; Danandeh Mehr and Nourani, 2018; Safari and Mehr, 2018; Danandeh Mehr et al., 2019). In many cases, SC techniques have been used mainly to discover a mathematical relationship between empirically observed variables, which is called symbolic system identification or black-box modeling. Once a model is discovered and verified, it can be used to predict future values of the state variables of the system (Ghorbani et al., 2018). Genetic programming (GP) is one of the most popular evolutionary computing techniques; it uses a Darwinian algorithm to solve a problem (Koza, 1992). In any GP variant, a population of random programs (potential solutions) is created first, and the genetic material of each individual is then repetitively improved through evolutionary operations to achieve a desired state. GP has received a great deal of attention over the past few decades and



has been applied to many research areas, including engineering, medicine, and business (Obiedat et al., 2013; Danandeh Mehr et al., 2018). The literature shows that GP can be used satisfactorily to solve classification, pattern recognition, and time series modeling problems (Jabeen and Baig, 2010; Sattar et al., 2017; Hrnjica and Danandeh Mehr, 2019). A comprehensive review paper by Danandeh Mehr et al. (2018) narrated how GP can be used to solve a variety of problems in water resources engineering and explored its popularity in studies related to hydrological prediction. It also highlighted that GP is a gray-box model that allows the modeler to inject human knowledge into the SC algorithm.
Modeling of the streamflow process is an important task for planning, management, and operation of water resources systems. For example, precise forecasts are required for flood damage mitigation, food production, navigation management, and environmental protection. In the past, many efforts have been devoted to identification of the streamflow process, which is a complex phenomenon and is typically modeled using either conceptual methods (e.g., Duan et al., 1992; Aksoy and Bayazit, 2000; Suribabu and Bhaskar, 2015) or SC techniques (e.g., Toth, 2009; Wang et al., 2009; Danandeh Mehr and Demirel, 2016; Yaseen et al., 2017; Danandeh Mehr et al., 2018). The problem becomes more complicated when the flow pattern in a river is intermittent, i.e., when the river or stream occasionally experiences dry spells (Danandeh Mehr et al., 2018). Dry spells occur frequently in intermittent rivers, particularly in mountainous tributaries and snow-fed streams (Danandeh Mehr et al., 2018). Owing to the low density of gauging networks in mountainous regions, one may not be able to model the streamflow process conceptually (Danandeh Mehr et al., 2014a). In such cases, if a reliable dataset of streamflow records is available for a stream gauging station, the use of SC techniques is a good alternative for modeling and forecasting the streamflow time series. Such an approach is typically called univariate modeling, where only the antecedent streamflow records are employed to form a predictive model. Selecting an appropriate SC technique, identifying the most effective inputs (at certain lags), and guarding against the common overfitting problem of SC tools are some of the vital issues a modeler must consider in univariate streamflow modeling.
This chapter provides an overview of GP and discusses its recent advancements. Then, a brief review of the most recent (2011–20) studies on univariate streamflow forecasting is presented to demonstrate how GP variants have been used effectively to forecast streamflow at different lead times. Finally, the calibration and verification of two GP variants, namely classical GP and gene expression programming (GEP), for 1-month-ahead streamflow forecasting in the Sedre Stream, a mountainous river in the Antalya Basin, Turkey, are illustrated.


7.2 Overview of genetic programming and its variants
7.2.1 Classical genetic programming
GP is a domain-independent, problem-solving approach in which computer programs are evolved to find solutions to problems (Koza, 1992; Tanev et al., 2005). The GP algorithm is based on the Darwinian principle of "survival of the fittest" (Koza, 1992). The computer programs are typically characterized by a tree structure known as a genome. Fig. 7.1 illustrates a genome and the corresponding mathematical equation built from a root node (subtraction), inner nodes of addition, multiplication, and subtraction, and terminal nodes of x1, x2, and random numbers. Each node in a GP tree can adopt a function or a terminal variable, such as x1 and x2 in Fig. 7.1. Generally, the main steps followed in GP-based forecast modeling are the selection of (i) a set of appropriate functions, (ii) input variables, and (iii) the maximum depth (also referred to as height) of the GP trees (Hrnjica and Danandeh Mehr, 2019). Wise decisions at these steps not only help the computational algorithm reach more accurate solutions but also help to avoid overfitting and overly complicated solutions. Regardless of the type of problem, the GP algorithm begins with the creation of an initial population of programs called potential solutions. The solutions that perform well during the training phase survive to the next generation, where they act as parents that create offspring. To this end, three evolutionary operators, namely reproduction, crossover, and mutation, typically act on the genomes (Hrnjica and Danandeh Mehr, 2019). Reproduction is the process of transferring the best

FIGURE 7.1 A tree-shaped genome and its corresponding mathematical representation.


single solution into the new population set of offspring without any modification. The process commences with the goodness-of-fit assessment of the initial programs and ends after identification of the best-fitted one. Crossover is an operation that takes two of the best solutions as parents and yields two offspring by exchanging genetic material of the parents. Fig. 7.2 illustrates the crossover process, in which a branch of parent 1 (red nodes) is exchanged with a branch of parent 2 (blue nodes). The offspring are solutions that possess genetic material of their parents. Many studies have shown that offspring fit the training set better than their parents. Mutation is the third genetic operation, in which genetic material of a single parent is replaced with new genetic material (a new subtree) at the mutation point. As illustrated in Fig. 7.3, the subtree x1*sin x2 in the parent gives way to the randomly created log x2 subtree in the offspring. The aforementioned operations are repeated using the population of offspring as the new set of parents until an individual shows a desired level of

FIGURE 7.2 An example of crossover operation.

Genetic programming for streamflow Chapter | 7

197

FIGURE 7.3 Mutation operation acts on a genetic programming chromosome.

fit to the training dataset. If the individual also shows favorable accuracy for the testing data, then it is called the best solution and the algorithm can be terminated.
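To make the tree-based representation and the crossover/mutation operators of Figs. 7.1–7.3 concrete, the following minimal Python sketch represents genomes as nested tuples. It is an illustration only, not the algorithm used later in the chapter; the function set, terminal names, and the fixed crossover point are all assumptions made for brevity.

```python
import random
import operator

# Illustrative function set (arity 2) and terminal set for a toy GP genome.
FUNCS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
TERMS = ["x1", "x2"]

def random_tree(depth=3):
    """Grow a random expression tree (nested tuples) up to the given depth."""
    if depth == 0 or random.random() < 0.3:
        return random.choice(TERMS + [round(random.uniform(-1, 1), 2)])
    op = random.choice(list(FUNCS))
    return (op, random_tree(depth - 1), random_tree(depth - 1))

def evaluate(tree, env):
    """Recursively evaluate a genome given a dict of terminal values."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return FUNCS[op](evaluate(left, env), evaluate(right, env))
    return env.get(tree, tree)  # terminal variable or numeric constant

def crossover(parent1, parent2):
    """Exchange one branch of parent1 with one branch of parent2 (cf. Fig. 7.2);
    a fixed exchange point is used here instead of a random subtree choice."""
    if isinstance(parent1, tuple) and isinstance(parent2, tuple):
        op1, l1, r1 = parent1
        op2, l2, r2 = parent2
        return (op1, l2, r1), (op2, l1, r2)
    return parent2, parent1

def mutate(tree, depth=2):
    """Replace a randomly chosen branch with a newly grown subtree (cf. Fig. 7.3)."""
    if not isinstance(tree, tuple) or random.random() < 0.3:
        return random_tree(depth)
    op, left, right = tree
    if random.random() < 0.5:
        return (op, mutate(left, depth), right)
    return (op, left, mutate(right, depth))

genome = random_tree()
print(genome, "->", evaluate(genome, {"x1": 2.0, "x2": 0.5}))
```

In a full GP run, such genomes would be evaluated against a fitness function and the reproduction, crossover, and mutation operators applied generation after generation, as described above.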

7.2.2 Multigene genetic programming

Multigene genetic programming (MGGP) is a robust GP variant that combines low-depth GP trees to increase the fitness of the best solutions. The best multigene solution is calculated as the linear sum of weighted single genes plus a constant value. Thus, a simplified MGGP model can be mathematically expressed as (Hrnjica and Danandeh Mehr, 2019) Eq. (7.1):

y = c_0 + c_1 Gene_1 + c_2 Gene_2 + \cdots + c_i Gene_i    (7.1)

where Gene_i is the mathematical function of the ith gene evolved by classical GP for the problem, c_0 is the irregular term (noise), and c_1, c_2, ..., c_i are the regression coefficients obtained by the least squares optimization technique. Details of MGGP and its applications in hydrological studies can be found in Danandeh Mehr et al. (2018).
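The least squares step of Eq. (7.1) can be sketched as follows, assuming the gene outputs have already been evaluated on the training inputs. The gene expressions and the synthetic target below are stand-ins for illustration only.

```python
import numpy as np

# Stand-in gene outputs evaluated on 50 training samples.
x1 = np.random.rand(50)
x2 = np.random.rand(50)
y = 2.0 + 1.5 * x1 - 0.8 * x1 * x2 + 0.05 * np.random.randn(50)  # synthetic target

gene_outputs = np.column_stack([
    x1,            # Gene 1
    x1 * x2,       # Gene 2
    np.sin(x2),    # Gene 3
])

# Eq. (7.1): y = c0 + c1*Gene1 + ...; prepend a column of ones for c0 and
# solve for the coefficients by ordinary least squares.
design = np.column_stack([np.ones(len(y)), gene_outputs])
coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
print("c0..c3:", np.round(coeffs, 3))
```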

7.2.3 Linear genetic programming

Linear genetic programming (LGP) is another form of GP in which the genetic operators act on a linear strip instead of a tree-shaped gene (Uyumaz et al., 2014; Danandeh Mehr and Sorman, 2018). In other words, the program is expressed as a one-dimensional row of functions, terminals, and constants (if any). This structure has been suggested mainly to accelerate the runtime and to reach more suitable solutions. In Fig. 7.4, each of the instructions includes a function that accepts a minimum number of constants or memory variables with an initial value of zero, called registers (r). The r value changes consecutively at each instruction, and the result is equal to the r value at the last instruction. More details about LGP and its applications in hydrological studies can be found in Danandeh Mehr et al. (2018).
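A minimal register-based sketch of this idea is given below. The instruction list is an illustrative guess that reproduces the output y = 2x − 3 mentioned in the caption of Fig. 7.4; it is not the exact program shown in that figure.

```python
# Each instruction is (operation, operand); the program acts on a single
# register r0, mirroring the linear instruction strip sketched in Fig. 7.4.
program = [
    ("add", "x"),   # r0 = r0 + x   (r0 starts at 0)
    ("mul", 2.0),   # r0 = r0 * 2
    ("sub", 3.0),   # r0 = r0 - 3
]

def run_lgp(program, x):
    """Execute a linear GP program; the output is the final register value."""
    r0 = 0.0
    for op, operand in program:
        value = x if operand == "x" else operand
        if op == "add":
            r0 += value
        elif op == "sub":
            r0 -= value
        elif op == "mul":
            r0 *= value
        elif op == "div":
            r0 = r0 / value if value != 0 else 1.0  # protected division
    return r0

print(run_lgp(program, x=5.0))  # 2*5 - 3 = 7.0
```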


FIGURE 7.4 Linear genetic programming representation with n instructions (upper panel) and a linear sequence of four instructions acting on the register r0 that creates the output variable y = 2x − 3 (lower panel).

7.2.4 Gene expression programming

GEP is one of the commonly used GP variants; it picks some of the developed subtrees and combines them to improve the model accuracy with respect to a given objective function (Ferreira, 2002). The best combination of subtrees can be found through a trial-and-error procedure over several linking functions. The primary difference between the GEP and GP algorithms is that GEP adopts a linear, fixed-length representation of computer programs (Fig. 7.5), which can later be converted into a GP tree. Many studies have claimed the superiority of GEP over traditional GP in terms of both efficiency and effectiveness (Ferreira, 2002). An initial program in GEP has the form of a string of fixed length composed of one or more genes called subexpressions (sub-ETs), and the best result is built by linking the subexpressions using mathematical or Boolean logic functions such as AND, OR, NOT, etc. (Rahmani-Rezaeieh et al., 2020). Each subexpression is composed of two parts: (i) the head and (ii) the tail. The head contains functions and variables (terminals), whereas the tail is filled merely with terminals (Fig. 7.5). As in GP, a function accepts one or more arguments and returns a result after evaluation. To ensure the validity of a gene, the length of the head h is chosen, whereas the length of the tail t is determined using the following equation (Ferreira, 2002):

t = h(n - 1) + 1    (7.2)

where n is the maximum number of arguments of all predefined functions. In GEP, the best solution is obtained through the integration of the relevant sub-ETs using algebraic or Boolean logic functions. For example, the GEP model shown in Fig. 7.6 denotes the output variable y as the product of three subexpressions linked together with the multiplication function. Mathematically, the model is presented as follows:

y = \sin(x_1 + x_2) \times \frac{x_2 + x_2}{x_1} \times (2x_1)    (7.3)

FIGURE 7.5 The gene expression programming gene string representing (sin(x1 + x2) − x1) × 2x2/x1.


FIGURE 7.6 A gene expression programming tree involving three sub-ETs linked by two multiplication functions.

where x1 and x2 are the input variables, and the multiplication functions between the parenthesized terms are called the linking functions. Details about evolutionary operations in GEP may be obtained from Ferreira (2002).
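The head/tail construction of Eq. (7.2) can be sketched as follows. The function and terminal sets used here are assumptions for illustration, not the exact GEP configuration applied later in the case study.

```python
import random

# Illustrative function and terminal sets; "sin" is unary, the rest are binary,
# so the maximum arity n used in Eq. (7.2) is 2.
functions = ["+", "-", "*", "/", "sin"]
terminals = ["x1", "x2"]

def tail_length(head_length, max_arity):
    """Eq. (7.2): t = h(n - 1) + 1 guarantees enough terminals for any head."""
    return head_length * (max_arity - 1) + 1

def random_gene(head_length, max_arity=2):
    """Build one GEP gene: the head mixes functions and terminals, the tail holds terminals only."""
    head = [random.choice(functions + terminals) for _ in range(head_length)]
    tail = [random.choice(terminals) for _ in range(tail_length(head_length, max_arity))]
    return head + tail

h = 7  # head size, matching the value later reported in Table 7.3
print("tail length:", tail_length(h, 2))       # 7*(2-1)+1 = 8
print("gene:", " ".join(random_gene(h)))
```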

7.3 A brief review of the recent studies

Different variants of GP have been used for streamflow prediction in recent years. The main reasons that have put the GP methods widely into practice are their ability to evolve explicit mathematical functions and their low computational cost (Rahmani-Rezaeieh et al., 2020). Considering the importance of accurate streamflow forecasts for the management of water resources, a brief review of the recent applications of classical GP and its advanced variants in univariate streamflow modeling is presented in this section. Salient relevant studies that used GP for univariate streamflow modeling were obtained from the Web of Science database for the current decade (2011–19), and these are listed in Table 7.1. A few studies reporting satisfactory applications of GP variants were published prior to 2011 (e.g., Sivapragasam et al., 2008; Guven, 2009; Wang et al., 2009). It is seen from Table 7.1 that the GP variants were used for both short lead time (up to daily) and long lead time forecasting. Among all the GP variants, classical GP and GEP were the most frequently used tools for streamflow forecasting. The LGP method was used by Danandeh Mehr et al. (2013, 2014b) for 1-month ahead prediction of monthly streamflow in the Çoruh River, Turkey, using data from an upstream station. Rather than developing a regression model between two successive stations, the LGP model was trained solely with observed flows at an upstream gauging station. The results demonstrated the superiority of LGP over the ANN and wavelet-ANN models. Oyebode et al. (2015) compared LGP with ANN in order to find the best monthly streamflow prediction model with 1-year lead time for the upper Umkomazi River, South Africa. A good performance of the LGP is reported during the


TABLE 7.1 Recent studies that implemented genetic programming (GP) for time series modeling of streamflow during the period 2011–20.

S. No. | Authors | GP variant | Time scale | Country
1 | Khatibi et al. (2011) | Genetic programming | Daily | Turkey
2 | Toro et al. (2013) | Gene expression programming | Daily | Colombia
3 | Danandeh Mehr et al. (2013) | Linear genetic programming | Monthly | Turkey
4 | Danandeh Mehr et al. (2014b) | Linear genetic programming | Monthly | Turkey
5 | Oyebode et al. (2015) | Linear genetic programming | Annual | South Africa
6 | Karimi et al. (2016) | Gene expression programming | Daily, monthly | Turkey
7 | Yadav et al. (2016) | Genetic programming | Hourly | Germany
8 | Al-Juboori and Guven (2016) | Gene expression programming | Monthly | Turkey and Iraq
9 | Mirzaei-Nodoushan et al. (2016) | Genetic programming | Monthly | Iran
10 | Danandeh Mehr and Kahya (2017) | Multigene genetic programming | Daily | Turkey
11 | Abdollahi et al. (2017) | Gene expression programming | Daily | Iran
12 | Ravansalar et al. (2017) | Linear genetic programming | Monthly | Iran
13 | Danandeh Mehr (2018) | Gene expression programming | Monthly | Iran
– | Danandeh Mehr and Sorman (2018) | Linear genetic programming | Monthly | Iran
14 | Mehdizadeh and Sales (2018) | Gene expression programming | Monthly | Iran
15 | Karimi et al. (2018) | Gene expression programming | Daily | China
16 | Ghorbani et al. (2018) | Multigene genetic programming | Daily | Turkey
17 | Zhang et al. (2018) | Genetic programming | Daily | China
18 | Mehdizadeh et al. (2019) | Gene expression programming | Monthly | Iran
19 | Rahmani-Rezaeieh et al. (2020) | Gene expression programming | Daily | Iran

training and validation periods; however, the performance of the ANN model was poor during the validation period due to under- or overestimation of the observed streamflow values. Karimi et al. (2016) suggested a wavelet-GEP model to forecast daily and monthly streamflow in the Filyos River, Turkey. The results showed higher performance of wavelet-GEP compared with ANN and adaptive neuro-fuzzy inference system models. GEP was also suggested by Al-Juboori and Guven (2016) to develop a stepwise streamflow prediction model at the monthly time horizon. Promising results were reported for monthly streamflow forecasts in the Hurman (Turkey), the Diyala, and the Lesser Zab Rivers (Iraq). Karimi et al. (2018) compared the predictive efficiency of GEP and the support vector machine (SVM) technique for daily streamflow forecasting at four gauging stations on the Heihe River, China. Results of three different input configurations (univariate models) revealed that the performance of the GEP-based models was better than that of the SVM-based models for prediction of 1-day ahead streamflow. Wavelet-GEP was also used by Abdollahi et al. (2017) to investigate the effect of flow patterns and of combined precipitation and flow patterns as modeling inputs for daily streamflow prediction in some mountain rivers in Iran. The study indicated that GEP was better than ANN in streamflow prediction. However, the wavelet-ANN model was slightly superior to the wavelet-GEP model. Besides, compared with the univariate models, the use of precipitation patterns did not remarkably increase the efficiency of the models. More recently, Danandeh Mehr (2018) applied GEP to investigate the input–output mapping capability of the standalone GEP method for single-station monthly streamflow modeling in the Shavir Stream, an intermittent river in northwest Iran. Various GEP setups were developed and validated using streamflow records. The efficiency results of the evolved GEP-based models were compared with those of conventional GP as well as multilinear regression models developed in the study as benchmarks. The study concluded that standalone GEP is not capable of capturing the nonlinear behavior of intermittent streamflow series. Coupling a genetic algorithm to optimize GEP solutions yielded better predictive accuracy than standalone GEP.


In the most recent studies, MGGP and its hybrid forms were suggested for both univariate and multivariate streamflow forecasting. For example, Danandeh Mehr and Kahya (2017) showed that integration of moving average data preprocessing with an MGGP engine may decrease the timing error of univariate standalone MGGP models. The study suggested that the Pareto-optimal solution can be considered a parsimonious daily streamflow model. In another study, integration of chaos theory with MGGP was found superior to MGGP alone, and hence this integrated approach can be satisfactorily used for daily streamflow forecasting (Ghorbani et al., 2018).

7.4 A case study

7.4.1 Study area and data

Fig. 7.7 shows the flowchart for applying GP and GEP in univariate streamflow forecasting. The application of GP/GEP to univariate streamflow forecasting is demonstrated through a case study where streamflow is forecasted for the

FIGURE 7.7 Flowchart for applying genetic programming (GP) and gene expression programming (GEP) for univariate streamflow forecasting.


FIGURE 7.8 Antalya Basin and location of the Sedre Stream catchment in the basin.

Sedre River, located in Antalya Basin, Turkey (Fig. 7.8). The Antalya Basin covers approximately 19.5 thousand km2, which is about 2.5% of the territory of Turkey. The basin is surrounded by the Sultan Mountains in the north, Alanya District and the Taurus Mountains in the east, and the Beydağları and Katrancık Mountains in the west, and is bounded by the Gulf of Antalya in the south. There are 11 main rivers (from west to east: the Boğaçay, Düden, Aksu, Köprüçay, Manavgat, Karpuz, Alara, Kargı, Obaçay, Dim, and Sedre Stream) and many lakes such as the Eğirdir and Karacaören Dam reservoirs in Antalya Basin, which make it one of the richest regions of Turkey in terms of water resources. The Sedre Stream, locally known as Sapadere canyon, springs from the Toros Mountains at an elevation of 950 m and reaches the Mediterranean Sea in Alanya City of Antalya Province after a course of 25 km. Daily streamflow data of the Sedre Stream have been recorded since 1988 at the Sedre gauging station (36°26′40.75″ N, 32°12′50.50″ E) by the State Hydraulic Works of Turkey. The long-term mean annual streamflow at the station is about 2.75 m3/s. The monthly streamflow time series for the 1987–2015 period is presented in Fig. 7.9. Of the total 334 streamflow data points, the first 70% and last 30% of the observations were separated for training and testing of the GP models, respectively. Table 7.2 presents the statistical features of the entire dataset and its subsets. It should be noted that we rescaled the input/output data to the range [0.0–1.0] by the well-known minimum–maximum normalization method before subjecting the dataset to the modeling setup.
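The minimum–maximum rescaling and the chronological 70/30 split described above can be sketched as follows; the synthetic series below merely stands in for the 334 observed monthly flows.

```python
import numpy as np

# Illustrative stand-in for the 334 observed monthly flows of the Sedre Stream (m3/s).
q = np.random.gamma(shape=1.2, scale=2.3, size=334)

# Rescale the whole series to [0, 1] with minimum-maximum normalization.
q_scaled = (q - q.min()) / (q.max() - q.min())

# Chronological 70/30 split into training and testing subsets.
n_train = int(0.7 * len(q_scaled))          # 233 of the 334 values
q_train, q_test = q_scaled[:n_train], q_scaled[n_train:]
print(len(q_train), len(q_test), q_scaled.min(), q_scaled.max())
```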


FIGURE 7.9 Mean monthly streamflow of the Sedre Stream during 1988–2015.

7.4.2 Criteria for evaluating performance of models

After the training of the GP variants, the model that produces the best result in terms of the coefficient of efficiency (R2) and root mean square error (RMSE) measures is chosen as the best solution. The performance evaluation criteria chosen in this study have been widely used in the literature studies tabulated in Table 7.1. Mathematical expressions of the chosen criteria are given as

R^2 = 1 - \frac{\sum_{i=1}^{n} (X_i^{obs} - X_i^{pre})^2}{\sum_{i=1}^{n} (X_i^{obs} - X_{mean}^{obs})^2}    (7.4)

RMSE = \sqrt{\frac{\sum_{i=1}^{n} (X_i^{obs} - X_i^{pre})^2}{n}}    (7.5)

where X_i^{obs} is the observed monthly streamflow, X_i^{pre} is the predicted monthly streamflow, X_{mean}^{obs} is the mean monthly streamflow, and n is the number of observations.
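A short sketch of Eqs. (7.4) and (7.5), with toy observed and predicted values used purely for illustration:

```python
import numpy as np

def r_squared(obs, pre):
    """Coefficient of efficiency, Eq. (7.4)."""
    obs, pre = np.asarray(obs, float), np.asarray(pre, float)
    return 1.0 - np.sum((obs - pre) ** 2) / np.sum((obs - obs.mean()) ** 2)

def rmse(obs, pre):
    """Root mean square error, Eq. (7.5)."""
    obs, pre = np.asarray(obs, float), np.asarray(pre, float)
    return np.sqrt(np.mean((obs - pre) ** 2))

observed = [1.2, 3.4, 0.8, 5.1, 2.2]
predicted = [1.0, 3.0, 1.1, 4.6, 2.5]
print(round(r_squared(observed, predicted), 3), round(rmse(observed, predicted), 3))
```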

7.5 Results and discussion

An optimal number of input vectors (lags) may lead GP/GEP to generate a robust model. Conversely, inadequate or excessive inputs may yield poor or overly complicated models (Danandeh Mehr et al., 2018). Typically, the autocorrelation function (ACF) and partial autocorrelation function (PACF) of the given time series are used for input identification in time series modeling. Although this is based on the linear correlation between the present and past streamflow series, the method can be used to select some initial inputs. Then, using the inherent

TABLE 7.2 Basic statistics of observed streamflow in the Sedre Stream during 1988–2015.

Data set | Number of data | Minimum (m3/s) | Maximum (m3/s) | Mean (m3/s) | Standard deviation (m3/s) | Coefficient of variation | Skewness
Entire dataset | 334 | 0 | 35.5 | 2.77 | 4.2 | 1.52 | 3.16
Training dataset | 324 | 0.003 | 35.5 | 2.78 | 4.4 | 1.57 | 3.52
Testing dataset | 100 | 0 | 17.0 | 2.76 | 3.8 | 1.37 | 1.88


evolutionary search methods of GP, the best input vectors can be identified. To this end, we first plotted the correlogram of the streamflow series and visually inspected the patterns of the ACF and PACF. Then, we used GP/GEP to find the optimal number of lags. Fig. 7.10 shows the ACF, PACF, and the corresponding 95% confidence levels of the time series for a lag range of 0–60 months at the station. It is seen that the ACF contains an oscillating pattern with a period of almost 12 months. This means that monthly streamflow at the station is more strongly correlated with its value in the preceding year than with that of the preceding month. Furthermore, the PACF revealed that the serial correlation is not strong beyond one year. Therefore, the streamflow Qt of the Sedre Stream for a given month is modeled as a function of the streamflow 1 and 12 months earlier. The mathematical form of the 1-month lead time model can be expressed as

Q_t = f(Q_{t-1}, Q_{t-12})    (7.6)
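Constructing the input–output pattern of Eq. (7.6) from a monthly series can be sketched as below; the series, dates, and column names are assumptions made only for illustration.

```python
import pandas as pd
import numpy as np

# Illustrative monthly series; in the chapter this would be the normalized Sedre Stream record.
rng = pd.date_range("1988-01-01", periods=120, freq="MS")
flow = pd.Series(np.random.gamma(1.2, 2.3, size=len(rng)), index=rng, name="Q")

# Build the pattern of Eq. (7.6): predict Q_t from Q_{t-1} and Q_{t-12}.
data = pd.DataFrame({
    "Q_t": flow,
    "Q_t-1": flow.shift(1),
    "Q_t-12": flow.shift(12),
}).dropna()  # the first 12 months lack a complete set of lags

X = data[["Q_t-1", "Q_t-12"]].to_numpy()
y = data["Q_t"].to_numpy()
print(X.shape, y.shape)
```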

The streamflow data at times t−1 and t−12, as well as a set of random floating point numbers between 0 and 1, were chosen as the members of the terminal set for both the GP and GEP setups. To create the classical GP- and GEP-based models for 1-month lead time, the software GPdotNET v5.0 (Hrnjica and Danandeh Mehr, 2019) and GeneXproTools v5.0 were used, respectively. The former software is freely available, whereas the latter is commercial. To control the complexity of the solutions, only basic arithmetic operations (addition, subtraction, multiplication, and division) were used as the members of the functional set. Regarding the evolutionary parameters (i.e., the rates of crossover, mutation, and reproduction), we performed several initial runs and identified suitable rates for the evolutionary operators. The RMSE was used as the objective function to train both the GP and GEP models. The average fitness

FIGURE 7.10 Autocorrelation function (ACF) and partial autocorrelation function (PACF) diagrams for the observed streamflow series in the Sedre Stream.


during generations was monitored to stop the run, as suggested by Danandeh Mehr et al. (2018). To this end, two criteria were considered: (i) the ongoing run would be stopped when GP starts to create weaker solutions, and (ii) the ongoing run would be stopped if extra generations, up to 200 generations, do not improve the performance of the available model. These strategies help the modeler to avoid the overfitting problem. The other parameters/methods used to set up the GP/GEP runs in this study are summarized in Table 7.3. In initial trials with both the GP and GEP models, the algorithms were allowed to use trigonometric, square root, and exponential functions as a challenge. However, the results showed that the use of complex functions in the functional set not only increases the complexity of the solutions but also leads the GP and GEP models to overtrain after only a limited number of generations. Consequently, the derived models were not efficient for the testing periods. On the other hand, when only the arithmetic operations were used, the best model was developed within the first 200 generations. Results of the comparative performances of the best evolved monolithic GP and GEP models are presented in Table 7.4. According to the results, both the GP and GEP algorithms provide more or less the same prediction accuracy, with a higher correlation between the model-predicted and observed data for the testing period. The R2 value for the best GEP model equals 0.46, with the lowest forecasting error of 2.6 m3/s. Although the models exhibited better performance for the testing period, it must not be inferred that the models were not trained well. The main reasons are the relatively smaller number of data points chosen for the testing period and the higher standard deviation of the observed data in the training period due to the occurrence of the global peak flow in this period. As previously mentioned,

TABLE 7.3 Parameters selected for the genetic programming (GP) and gene expression programming (GEP) setup.

Parameter | GP | GEP
Population size | 500 | 500
Initialization | Half and half | Half and half
Maximum initial level | 5 | 1
Maximum tree depth | 6 | 6
Selection method | Tournament | Tournament
Linking function | — | Addition
Head size | — | 7


TABLE 7.4 Values of goodness-of-fit criteria for the best genetic programming (GP) and gene expression programming (GEP) models used in forecasting of monthly streamflow in the Sedre Stream.

Model | Root mean square error (m3/s), Training | Root mean square error (m3/s), Testing | Coefficient of determination (R2), Training | Coefficient of determination (R2), Testing
GP | 3.261 | 2.819 | 0.383 | 0.458
GEP | 3.268 | 2.601 | 0.380 | 0.460

higher levels of program size or the use of more complex functions in the GP/GEP model setup lead to better model performance in the training period, but the models become overtrained and provide exceptionally weak results in the testing period, particularly in terms of the R2 criterion. Thus, the hypothesis that "more complex functions may evolve more efficient solutions" was rejected in the current study. The streamflow series predicted by the GP and GEP models were plotted and compared with the observed streamflow series during both the training and testing periods (Fig. 7.11). In general, it is seen that the models can capture the strong periodicity of the observed streamflow series, but they suffer from inadequate precision in estimating high/peak streamflow magnitudes as well as from a pronounced lag or timing error. This result agrees with earlier studies dealing with univariate streamflow modeling that reported lagged prediction problems of standalone GP models (e.g., Danandeh Mehr and Kahya, 2017; Rahmani-Rezaeieh et al., 2020). Such weakness in estimating peak streamflow values might be related to the strong variability of the streamflow records in the range [0, 35.5 m3/s], which makes prediction of high/peak streamflow excessively difficult. On the other hand, the lagged predictions may originate from the rather strong autoregressive relation between the target variable Qt and its preceding value (i.e., Qt-1), so that GP (GEP) gives more weight to the genes (sub-ETs) containing the preceding streamflow values. Such a drawback of standalone GEP techniques has recently been reported by Rahmani-Rezaeieh et al. (2020), where GEP was implemented for time series modeling of daily streamflow in an intermittent stream in Iran. According to that study, an ensemble GEP that combines the power of GP and GEP may reduce the time lag error of standalone GEP, particularly when timing accuracy of high streamflow is of interest.


FIGURE 7.11 Observed and genetic programming (GP)/gene expression programming (GEP) forecasted streamflow series at (A) training and (B) testing periods.

Tree expressions of the best-evolved GP and GEP models are shown in Fig. 7.12 in order to investigate the frequency of input variables in the best-evolved GP/GEP models. The parameters Ri and Ci denote random constants given in the terminal nodes of the GP and GEP models, respectively. Qt-12 and Qt-1 are the normalized values of the observed streamflow. The corresponding mathematical expressions of each model are also provided in Fig. 7.12. In the GP (GEP) model, Qt-1 and Qt-12 appeared six (five) and three (four) times, respectively. Therefore, Qt-1 could be considered the most dominant variable among the prespecified predictors. This result is consistent with the aforementioned discussion, in which the appearance of lagged predictions was associated with the strong autoregressive relation between the predicted value (i.e., Qt) and its preceding value (i.e., Qt-1).

FIGURE 7.12 Tree expressions of the (A) best genetic programming and (B) gene expression programming models established for 1-month ahead streamflow forecasting in the Sedre Stream.


7.6 Conclusions

This chapter presented a review of current studies dealing with applications of GP variants in univariate streamflow forecasting over different parts of the world. Many studies have shown that GP and its variants performed better than classical time series models or even ANNs, SVMs, or other data-driven models. However, standalone GP-based models could not satisfactorily remove the phase lag in univariate time series forecasts. This chapter also investigated the ability of standalone GP and GEP techniques for 1-month ahead streamflow forecasting of the Sedre Stream in Antalya Basin of Turkey. The GP and GEP models were trained and validated using the mean monthly streamflow data of a 28-year period recorded at the river. Performance of the models was compared by employing different goodness-of-fit criteria. The observed streamflow data were normalized using the minimum–maximum normalization method before training of the models to derive dimensionally consistent models. The comparative performance evaluation showed a medium to high capability of both the GP and GEP models in forecasting low and medium monthly streamflow. However, it was revealed that the models were not capable enough to forecast annual maxima/peak streamflow, and their robustness for generalization was typically poor. Furthermore, the results indicated that the models frequently run into the overfitting problem when excess degrees of freedom are allowed in the creation of the primitive functional set. These results bring into question common beliefs about GP/GEP training regarding the optimal function set and the overfitting problem, and suggest lower degrees of freedom to cope with the problem. Referring to the case study results, it can be concluded that the GEP model was slightly better than the GP model in forecasting streamflow. The outcomes also revealed that adding more complicated functions or increasing the number or depth of genes would not necessarily improve model performance. In contrast, it may lead the GP/GEP to overtrain during the initial generations. Thus, the use of only simple arithmetic operators during the initial trials of GP/GEP runs, even for such highly nonlinear and periodic time series, is strongly recommended. The quantitative values of the performance evaluation criteria during the validation period indicated that neither standalone GP nor GEP was able to attain a high level of accuracy in estimating the peak values of streamflow. This drawback was attributed to the strong periodic structure of the streamflow series, as pronounced in the associated ACF plot. Returning to the literature, it should be recalled that neither standalone ANN nor SVM was able to forecast monthly streamflow data with a desired level of accuracy in the presence of strong variability in intermittent streamflow series. These results also rightly emphasize the necessity of further effort to improve the forecasting accuracy of SC tools, which might be achievable via hybridization strategies such as wavelet-GP or wavelet-GEP. Moreover, the performance of the


adopted tools could be investigated for streamflow forecasts at longer lead times. In a similar way as adopted in the case study, future studies could also consider other GP variants, such as stack- or grammar-based GP, that have never been used for monthly streamflow prediction in intermittent rivers.

References

Abdollahi, S., Raeisi, J., Khalilianpour, M., Ahmadi, F., Kisi, O., 2017. Daily mean streamflow prediction in perennial and non-perennial rivers using four data driven techniques. Water Resour. Manag. 31 (15), 4855–4874. https://doi.org/10.1007/s11269-017-1782-7.
Aksoy, H., Bayazit, M., 2000. A daily intermittent streamflow simulator. Turk. J. Eng. Environ. Sci. 24 (4), 265–276.
Al-Juboori, A.M., Guven, A., 2016. A stepwise model to predict monthly streamflow. J. Hydrol. 543, 283–292. https://doi.org/10.1016/j.jhydrol.2016.10.006.
Danandeh Mehr, A., Kahya, E., Olyaie, E., 2013. Streamflow prediction using linear genetic programming in comparison with a neuro-wavelet technique. J. Hydrol. 505, 240–249. https://doi.org/10.1016/j.jhydrol.2013.10.003.
Danandeh Mehr, A., Kahya, E., Bagheri, F., Deliktas, E., 2014. Successive-station monthly streamflow prediction using neuro-wavelet technique. Earth Sci. India 7 (4), 217–229. https://doi.org/10.1007/s12145-013-0141-3.
Danandeh Mehr, A., Kahya, E., Yerdelen, C., 2014. Linear genetic programming application for successive-station monthly streamflow prediction. Comput. Geosci. 70, 63–72. https://doi.org/10.1016/j.cageo.2014.04.015.
Danandeh Mehr, A., Demirel, M.C., 2016. On the calibration of multigene genetic programming to simulate low flows in the Moselle River. Uludag Univer. J. Faculty Eng. 21 (2), 365–376. https://doi.org/10.17482/uumfd.278107.
Danandeh Mehr, A., Kahya, E., 2017. A Pareto-optimal moving average multigene genetic programming model for daily streamflow prediction. J. Hydrol. 549, 603–615. https://doi.org/10.1016/j.jhydrol.2017.04.045.
Danandeh Mehr, A., 2018. An improved gene expression programming model for streamflow forecasting in intermittent streams. J. Hydrol. 563, 669–678. https://doi.org/10.1016/j.jhydrol.2018.06.049.
Danandeh Mehr, A., Sorman, A.U., 2018. Streamflow and sediment load prediction using linear genetic programming. Uludag Univer. J. Faculty Eng. 23 (2), 323–332. https://doi.org/10.17482/uumfd.352833.
Danandeh Mehr, A., Nourani, V., 2018. Season algorithm-multigene genetic programming: a new approach for rainfall-runoff modelling. Water Resour. Manag. 32 (8), 2665–2679. https://doi.org/10.1007/s11269-018-1951-3.
Danandeh Mehr, A., Jabarnejad, M., Nourani, V., 2019. Pareto-optimal MPSA-MGGP: a new gene-annealing model for monthly rainfall forecasting. J. Hydrol. 571, 406–415. https://doi.org/10.1016/j.jhydrol.2019.02.003.
Danandeh Mehr, A., Nourani, V., Kahya, E., Hrnjica, B., Sattar, A.M., Yaseen, Z.M., 2018. Genetic programming in water resources engineering: a state-of-the-art review. J. Hydrol. 566, 643–667. https://doi.org/10.1016/j.jhydrol.2018.09.043.
Duan, Q., Sorooshian, S., Gupta, V., 1992. Effective and efficient global optimization for conceptual rainfall-runoff models. Water Resour. Res. 28 (4), 1015–1031. https://doi.org/10.1029/91WR02985.


Ferreira, C., 2002. Gene expression programming in problem solving. In: Roy, R., Köppen, M., Ovaska, S., Furuhashi, T., Hoffmann, F. (Eds.), Soft Computing and Industry: Recent Applications. Springer, London, pp. 635–653. https://doi.org/10.1007/978-1-4471-0123-9_54.
Ghorbani, M.A., Khatibi, R., Mehr, A.D., Asadi, H., 2018. Chaos-based multigene genetic programming: a new hybrid strategy for river flow forecasting. J. Hydrol. 562, 455–467. https://doi.org/10.1016/j.jhydrol.2018.04.054.
Guven, A., 2009. Linear genetic programming for time-series modelling of daily flow rate. J. Earth Syst. Sci. 118 (2), 137–146. https://doi.org/10.1007/s12040-009-0022-9.
Hrnjica, B., Danandeh Mehr, A., 2019. Optimized Genetic Programming Applications: Emerging Research and Opportunities. IGI Global, Hershey, PA, p. 310.
Huang, Y., Lan, Y., Thomson, S.J., Fang, A., Hoffmann, W.C., Lacey, R.E., 2010. Development of soft computing and applications in agricultural and biological engineering. Comput. Electron. Agric. 71 (2), 107–127. https://doi.org/10.1016/j.compag.2010.01.001.
Jabeen, H., Baig, A.R., 2010. Review of classification using genetic programming. Int. J. Eng. Sci. Technol. 2 (2), 94–103.
Khatibi, R., Ghorbani, M.A., Kashani, M.H., Kişi, Ö., 2011. Comparison of three artificial intelligence techniques for discharge routing. J. Hydrol. 403 (3), 201–212. https://doi.org/10.1016/j.jhydrol.2011.03.007.
Karimi, S., Shiri, J., Kişi, Ö., Shiri, A.A., 2016. Short-term and long-term streamflow prediction by using wavelet–gene expression programming approach. ISH J. Hydraulic Eng. 22 (2), 148–162. https://doi.org/10.1080/09715010.2015.1103201.
Karimi, S., Shiri, J., Kişi, Ö., Xu, T., 2018. Forecasting daily streamflow values: assessing heuristic models. Nord. Hydrol. 49 (3), 658–669. https://doi.org/10.2166/nh.2017.111.
Koza, J.R., 1992. Genetic Programming: On the Programming of Computers by Means of Natural Selection, vol. 1. MIT Press, USA, p. 819.
Mehdizadeh, S., Fathian, F., Safari, M.J.S., Adamowski, J.F., 2019. A comparative assessment of time series and artificial intelligence models for estimating monthly streamflow: local and external data analyses approach. J. Hydrol. 124225. https://doi.org/10.1016/j.jhydrol.2019.124225.
Mehdizadeh, S., Sales, A.K., 2018. A comparative study of autoregressive, autoregressive moving average, gene expression programming and Bayesian networks for estimating monthly streamflow. Water Resour. Manag. 32 (9), 3001–3022. https://doi.org/10.1007/s11269-018-1970-0.
Mirzaei-Nodoushan, F., Bozorg-Haddad, O., Fallah-Mehdipour, E., Loáiciga, H.A., 2016. Application of data mining tools for long-term quantitative and qualitative prediction of streamflow. J. Irrigat. Drain. Eng. 142 (12), 04016061. https://doi.org/10.1061/(ASCE)IR.1943-4774.0001096.
Obiedat, R., Alkasassbeh, M., Faris, H., Harfoushi, O., 2013. Customer churn prediction using a hybrid genetic programming approach. Sci. Res. Essay. 8 (27), 1289–1295. https://doi.org/10.5897/SRE2013.5559.
Oyebode, O.K., Adeyemo, J.A., Otieno, F.A.O., 2015. Comparison of two data-driven modelling techniques for long-term streamflow prediction using limited datasets. J. S. Afr. Inst. Civ. Eng. 57 (3), 9–19. https://doi.org/10.17159/2309-8775/2015/V57N3A2.
Rahmani-Rezaeieh, A., Mohammadi, M., Mehr, A.D., 2020. Ensemble gene expression programming: a new approach for evolution of parsimonious streamflow forecasting model. Theor. Appl. Climatol. 139 (1–2), 549–564. https://doi.org/10.1007/s00704-019-02982-x.

Ravansalar, M., Rajaee, T., Kisi, O., 2017. Wavelet-linear genetic programming: a new approach for modeling monthly streamflow. J. Hydrol. 549, 461–475. https://doi.org/10.1016/j.jhydrol.2017.04.018.
Roy, R., Furuhashi, T., Chawdhry, P.K. (Eds.), 2012. Advances in Soft Computing: Engineering Design and Manufacturing. Springer Science & Business Media, p. 638. https://doi.org/10.1007/978-1-4471-3744-3.
Safari, M.J.S., Mehr, A.D., 2018. Multigene genetic programming for sediment transport modeling in sewers for conditions of non-deposition with a bed deposit. Int. J. Sediment Res. 33 (3), 262–270. https://doi.org/10.1016/j.ijsrc.2018.04.007.
Sattar, A.M.A., Gharabaghi, B., Sabouri, F., Thompson, A.M., 2017. Urban stormwater thermal gene expression models for protection of sensitive receiving streams. Hydrol. Process. 31 (13), 2330–2348. https://doi.org/10.1002/hyp.11170.
Sivapragasam, C., Maheswaran, R., Venkatesh, V., 2008. Genetic programming approach for flood routing in natural channels. Hydrol. Process. 22 (5), 623–628. https://doi.org/10.1002/hyp.6628.
Suribabu, C.R., Bhaskar, J., 2015. Evaluation of urban growth effects on surface runoff using SCS-CN method and Green-Ampt infiltration model. Earth Sci. India 8 (3), 609–626. https://doi.org/10.1007/s12145-014-0193-z.
Tanev, I., Brzozowski, M., Shimohara, K., 2005. Evolution, generality and robustness of emerged surrounding behavior in continuous predators-prey pursuit problem. Genet. Program. Evolvable Mach. 6 (3), 301–318. https://doi.org/10.1007/s10710-005-2989-6.
Toro, C.H.F., Meire, S.G., Gálvez, J.F., Fdez-Riverola, F., 2013. A hybrid artificial intelligence model for river flow forecasting. Appl. Soft Comput. 13 (8), 3449–3458. https://doi.org/10.1016/j.asoc.2013.04.014.
Toth, E., 2009. Data-driven streamflow simulation: the influence of exogenous variables and temporal resolution. In: Abrahart, R.J., See, L.M., Solomatine, D.P. (Eds.), Practical Hydroinformatics, Water Science and Technology Library, vol. 68. Springer, Berlin, Heidelberg, pp. 113–125. https://doi.org/10.1007/978-3-540-79881-1_9.
Uyumaz, A., Danandeh Mehr, A., Kahya, E., Erdem, H., 2014. Rectangular side weirs discharge coefficient estimation in circular channels using linear genetic programming approach. J. Hydroinf. 16 (6), 1318–1330. https://doi.org/10.2166/hydro.2014.112.
Wang, W.C., Chau, K.W., Cheng, C.T., Qiu, L., 2009. A comparison of performance of several artificial intelligence methods for forecasting monthly discharge time series. J. Hydrol. 374 (3), 294–306. https://doi.org/10.1016/j.jhydrol.2009.06.019.
Yadav, B., Ch, S., Mathur, S., Adamowski, J., 2016. Discharge forecasting using an online sequential extreme learning machine (OS-ELM) model: a case study in Neckar River, Germany. Measurement 92, 433–445. https://doi.org/10.1016/j.measurement.2016.06.042.
Yaseen, Z.M., Ebtehaj, I., Bonakdari, H., Deo, R.C., Mehr, A.D., Mohtar, W.H.M.W., et al., 2017. Novel approach for streamflow forecasting using a hybrid ANFIS-FFA model. J. Hydrol. 554, 263–276. https://doi.org/10.1016/j.jhydrol.2017.09.007.
Zhang, Z., Zhang, Q., Singh, V.P., Shi, P., 2018. River flow modelling: comparison of performance and evaluation of uncertainty using data-driven models and conceptual hydrological model. Stoch. Environ. Res. Risk Assess. 32 (9), 2667–2682. https://doi.org/10.1007/s00477-018-1536-y.

Chapter 8

Model tree technique for streamflow forecasting: a case study in sub-catchment of Tapi River Basin, India

Priyank J. Sharma1, P.L. Patel1, V. Jothiprakash2

1 Department of Civil Engineering, Sardar Vallabhbhai National Institute of Technology, Surat, Gujarat, India; 2 Department of Civil Engineering, Indian Institute of Technology Bombay, Mumbai, Maharashtra, India

8.1 Introduction

A model, in simple terms, is defined as the representation of a real-world phenomenon with the objective of explanation or prediction (Solomatine, 2005). In hydrologic and hydraulic applications, models can be commonly classified as (i) physical-scaled models, (ii) mathematical or process-based models, and (iii) empirical or statistical models (Solomatine, 2005; Solomatine and Ostfeld, 2008). The mathematical models, which simulate the physical processes in the system, are often known as knowledge-driven or physically based models (Solomatine, 2005; Bourdin et al., 2012). The empirical models, on the other hand, map the relationship between the inputs and outputs without explicitly accounting for the underlying physics of the hydrological processes. Such models are often referred to as data-driven models (DDMs) (Solomatine, 2005; Bourdin et al., 2012). The process-based hydrological models generalize the understanding of various phenomena involved in the hydrological cycle (system). Such models simulate the flow processes in a watershed by solving a system of partial differential equations that best describe the watershed characteristics. In hydrological applications, they are typically distinguished as (i) lumped conceptual models and (ii) distributed process-based models (Solomatine, 2005). The lumped conceptual models are easier to operate and are less data-intensive. The conceptual models primarily represent the transformation from rainfall to runoff; however, they do not consider the spatial variability of the



hydrological processes (Perrin et al., 2001). The distributed process-based models describe the system through mathematical equations involving continuity, energy, and momentum principles (Solomatine, 2005). Such models incorporate spatial and temporal variations within the catchment; however, they are highly data-intensive and require high computational power. The data-driven or black-box models consider the hydrological system as a black box and do not consider the underlying physical processes behind changes in the parameters and their behavior within the system. The DDMs attempt to generalize a relationship between the past hydrologic inputs, such as rainfall, evaporation, and temperature, and the desired output, i.e., runoff at the outlet. The DDMs involve a combination of statistical, soft computing, artificial intelligence, and machine learning techniques (Solomatine and Ostfeld, 2008). Some of the popular DDMs include artificial neural networks (ANNs), fuzzy rule-based approaches, tree-based models, evolutionary algorithms, etc. (Solomatine and Ostfeld, 2008; Oyebode et al., 2014). The distributed process-based models offer an accurate estimation of the hydrological processes under the condition of adequate data availability at finer spatial scales and over longer time lengths. On the other hand, the DDMs are mostly useful to model the watershed processes under limited data availability conditions (Iorgulescu and Beven, 2004). The usefulness of the DDMs to ascertain the behavior of a physical system has been well documented in the literature (e.g., Han et al., 2007; Galelli and Castelletti, 2013). Distinct advantages of the DDMs, such as ease of development, lower data and time requirements, and low computational cost, have made them popular over the process-based models among hydrologists, water resources researchers, and managers. One such data-driven technique, based on the principle of information theory, viz., the model tree (MT), is discussed at length.

8.2 Model tree

The MT is a state-of-the-art data-driven hierarchical algorithm used to assess the relation between input and output parameters (Quinlan, 1992; Solomatine and Dulal, 2003). The MTs are considered to be further extensions of classification and regression trees, wherein the former has numeric values associated with the leaves (Oyebode et al., 2014; Rezaie-Balf et al., 2017). The MT divides the input parameter space into various subspaces by sorting them down the tree from the root node to the leaf nodes and formulating a linear multivariable regression model for each of them (Witten and Frank, 2005; Rezaie-Balf et al., 2017). Thus, the MT models adopt piecewise linear functions to approximate nonlinear relationships (Jothiprakash and Kote, 2011b; Rezaie-Balf et al., 2019). For instance, a schematic representation of the MT technique involving splitting the input parameter space (X1X2) into five subspaces is shown in Fig. 8.1A. The generalized tree-like structure resulting from splitting the parameter space, wherein each leaf represents a linear regression model for


FIGURE 8.1 (A) Schematic representation of splitting the input parameter space and (B) constructing tree-based linear regression functions by model tree algorithm.

each subspace, is shown in Fig. 8.1B. The splitting process in the MT approach employs linear regression functions at the terminal nodes (leaves) instead of discrete class values, which enables the prediction of continuous numerical attributes (Goyal, 2014). The splitting criterion for the MT algorithm considers the standard deviation (sd) of the class values reaching a node as a measure of the error at that node and evaluates the expected error reduction as a result of testing each attribute at that node. The standard deviation reduction (SDR) is expressed as (Witten and Frank, 2005)

SDR = sd(T) - \sum_i \frac{|T_i|}{|T|} \, sd(T_i)    (8.1)

where T represents a set of instances that reaches the leaf node and Ti represents the datasets resulting from splitting the node as per the chosen attribute.
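A short numerical sketch of Eq. (8.1) for one candidate split is given below; the class values and the two-way split are toy assumptions chosen only to illustrate the calculation.

```python
import numpy as np

def sdr(parent_values, split_subsets):
    """Standard deviation reduction of Eq. (8.1) for one candidate split."""
    parent_values = np.asarray(parent_values, float)
    total = len(parent_values)
    weighted_child_sd = sum(
        len(subset) / total * np.std(np.asarray(subset, float))
        for subset in split_subsets
    )
    return np.std(parent_values) - weighted_child_sd

# Toy class values reaching a node and a candidate split on some attribute.
node_values = [1.0, 1.2, 0.9, 6.5, 7.1, 6.8]
left, right = [1.0, 1.2, 0.9], [6.5, 7.1, 6.8]
print(round(sdr(node_values, [left, right]), 3))  # a large SDR indicates a good split
```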


After exploring all possible splits (i.e., the attributes and probable split values), M5 chooses the one that maximizes the SDR. The splitting process terminates when the class values of all the instances that reach a node vary minimally, or when only a few instances remain. The relentless division often results in overfitting, and thus the overgrown trees must be pruned back. In the pruning process, the inner nodes (or subtrees) are transformed into leaf nodes by replacing them with linear regression models. The discontinuity between adjacent linear models after pruning is compensated by a smoothing process (Witten and Frank, 2005). The MT approach has emerged as a popular modeling tool for diverse applications in hydrologic studies such as flood forecasting (Solomatine and Xue, 2004), reservoir inflow prediction (Jothiprakash and Kote, 2011b), reservoir evaporation estimation (Arunkumar and Jothiprakash, 2012), reservoir sedimentation assessment (Garg and Jothiprakash, 2013), sediment yield prediction (Goyal, 2014), groundwater level prediction (Nalarajan and Mohandas, 2015), and reference evapotranspiration modeling (Keshtegar et al., 2019), among others. However, the current chapter is focused on the application of MT in streamflow forecasting.
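The "split the input space, then fit a linear model per leaf" idea can be emulated with common Python tools, as sketched below. This is not the M5 algorithm itself: scikit-learn's DecisionTreeRegressor splits on variance reduction rather than the SDR of Eq. (8.1) and performs no M5-style pruning or smoothing, and the data are synthetic.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.linear_model import LinearRegression

# Synthetic piecewise-linear data standing in for lagged hydrologic inputs.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(300, 2))
y = np.where(X[:, 0] < 5, 2 * X[:, 0] + X[:, 1], 0.5 * X[:, 0] - X[:, 1]) + rng.normal(0, 0.1, 300)

# Step 1: a shallow regression tree splits the input space into subspaces (leaves).
splitter = DecisionTreeRegressor(max_depth=2, min_samples_leaf=20).fit(X, y)
leaf_ids = splitter.apply(X)

# Step 2: fit one linear regression model per leaf, as an MT does.
leaf_models = {
    leaf: LinearRegression().fit(X[leaf_ids == leaf], y[leaf_ids == leaf])
    for leaf in np.unique(leaf_ids)
}

def predict(x_new):
    """Route each sample to its leaf and apply that leaf's linear model."""
    x_new = np.atleast_2d(x_new)
    leaves = splitter.apply(x_new)
    return np.array([leaf_models[l].predict(x.reshape(1, -1))[0] for l, x in zip(leaves, x_new)])

print(predict([[2.0, 1.0], [8.0, 1.0]]))
```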

8.3 Model tree applications in streamflow forecasting

The MT technique was first proposed by Quinlan (1992); however, Kompare et al. (1997) first demonstrated the application of MT in streamflow forecasting. Solomatine and Dulal (2003) later applied MT in modeling the rainfall-runoff process as an alternative to ANN, wherein the ability of MT to resolve the prominent drawbacks of the ANN technique was highlighted. The model structure of the latter was found to be hidden and difficult to interpret, whereas the MT models demonstrated faster training capabilities and generated easily understandable and interpretable results. Thereafter, Solomatine and Xue (2004) applied MT for flood forecasting in the upper reaches of the Huai River, China. The prediction accuracies of the MT and ANN models were found to be satisfactory; however, some inaccuracies were reported in the prediction of some peak flows. Therefore, a hybrid model was formulated by combining MT and ANN to improve the accuracy further. The MT models were also propounded to be helpful in parameter selection and in assessing parameter interrelationships for conceptual hydrological models. Jothiprakash and Kote (2011a) investigated the influence of data preprocessing on the application of data-driven techniques in the prediction of inflows. The results indicated that application of MT does not require data transformation (such as normalization) to improve model performance. Jothiprakash and Kote (2011b) also analyzed the effects of pruning and smoothing of MTs applied to predict intermittent flows into a reservoir. The analysis showed that unpruned and unsmoothed MT models performed better than other MT combinations in streamflow prediction.


Zia et al. (2015) demonstrated the successful application of MT in a resource-constrained system with fewer samples and simpler parameters by adopting 10-fold cross-validation instead of specifying separate calibration and validation datasets. Yaseen et al. (2016) compared the performance of various heuristic regression techniques, viz., least square support vector regression (LSSVR), multivariate adaptive regression splines (MARS), and MT, for prediction of monthly streamflows. The LSSVR model outperformed the MARS, MT, and multiple linear regression models, with improvement in model accuracy by embedding the periodicity component. Rezaie-Balf et al. (2017) modeled the nonlinear rainfall-runoff process using ANN, MARS, and MT techniques considering daily observed climatic parameters. The obtained results indicated the superiority of the MT model over the ANN and MARS models, as the former provides mathematical expressions of the relationships that can be easily interpreted compared with the latter. To overcome this drawback of the ANN and MARS models, the study recommended the use of mutual information methods to preprocess the inputs before using them in the model formation. Esmaeilzadeh et al. (2017) assessed the performances of ANN, support vector regression (SVR), wavelet-ANN (WANN), and MT techniques to predict inflows into the Sattarkhan reservoir, Iran. The outputs of MT were easy to interpret in the form of linear regression equations and could be easily employed by dam operators to generate 1-day ahead inflow forecasts without requiring the advanced mathematical skills needed for the WANN, SVR, and ANN techniques. More et al. (2019) presented the applicability of MT to datasets having a higher proportion of zero values and compared its performance with the conventional autoregressive integrated moving average (ARIMA) method. The study advocated the adoption of the MT technique for improved predictions at smaller time steps (viz., daily). Nourani et al. (2019) introduced a hybrid wavelet-M5 model to simulate the rainfall-runoff process, by adopting different data proportioning strategies, for daily and monthly timescales. The study inferred that the hybrid wavelet-MT model (i.e., a multilinear model) yielded better multistep-ahead predictions than nonlinear methods (i.e., WANN). The MTs offer the advantages of a high convergence rate, countering overfitting problems through the pruning operation, and partitioning of the input space to improve model accuracy (Oyebode et al., 2014). Furthermore, in comparison with regression trees, MTs are smaller in size and their regression functions do not involve a large number of variables (Ajmera and Goyal, 2012). However, certain disadvantages are also noted, viz., inaccuracy resulting from partitioning the input space when the ratio of instances is smaller than the number of attributes, higher computational demand to model data with high dimensionality, and difficulty in the physical interpretation of the generated equations. To overcome these drawbacks associated with the MT technique, several researchers (Nourani et al., 2019; Rezaie-Balf et al., 2019) have proposed hybrid-MT techniques, which involve data preprocessing before using the data as input parameters in the model.


8.4 Application of model tree in streamflow forecasting: a case study

A case study is presented in this section, where the MT technique is applied for streamflow forecasting in the Purna sub-catchment of Tapi Basin, India.

8.4.1 Study area

The Tapi River Basin spans an area of 65,145 km2 and is located in western central India (Fig. 8.2). The Tapi Basin is geographically demarcated into three subbasins, i.e., the upper, middle, and lower Tapi basins (Sharma et al., 2019). In the present study, the area under consideration is the Purna sub-catchment (area ≈ 18,490 km2), which is a part of the upper Tapi Basin (Fig. 8.2B). The Purna sub-catchment up to the Yerli stream gauging station is designated as the "Yerli sub-catchment," with an area of about 15,881 km2. There are 17 rain gauge stations in the Yerli sub-catchment, where daily rainfall data are recorded by the India Meteorological Department (IMD), Pune; the locations of the rain gauge stations are shown in Fig. 8.2C. The Purna River is gauged at the Lakhpuri, Gopalkheda, and Yerli stations and monitored by the Tapi Division, Central Water Commission (CWC), Surat; the locations of the three gauging stations are shown in Fig. 8.2C. The Purna River, with a length of about 353 km, is the longest tributary of the Tapi River, originating from the Gawilgarh hills in Betul district of Madhya

FIGURE 8.2 Index map of study area. (A) India, (B) Tapi Basin, and (C) Yerli sub-catchment.


Pradesh at an altitude of 900 m, and drains an area of 18,490 km2. The Purna River initially flows southward and then takes a westward turn before the Lakhpuri stream gauging station. The Purna sub-catchment is bounded by the Gawilgarh hills in the north, the Ajanta hills in the south, and the Mahadeo hills in the east. The peripheral part of the Purna sub-catchment comprises basaltic lava flows of the Deccan trap, whereas the central area is covered by the Purna alluvium. The Yerli sub-catchment largely exhibits flat topography, except for the hilly regions in the north (Fig. 8.2C). The major land use type in the Purna sub-catchment is agricultural land, which covers around 72% of the total land area (Loliyana and Patel, 2015). The Yerli sub-catchment displays a semiarid climate, exhibiting a mean aridity index (ratio of rainfall to potential evapotranspiration) of around 0.46 (Loliyana and Patel, 2018). Due to vast agricultural fields and high temperature regimes, the sub-catchment experiences higher evapotranspiration rates (Sharma et al., 2018b; Loliyana and Patel, 2018). Similar to the Tapi Basin, the Purna sub-catchment also receives the majority of its rainfall during the southwest monsoon season, i.e., from mid-June to mid-October (Sharma et al., 2018a). The average annual rainfall is 830 mm in the Tapi Basin (Jain et al., 2007) and 817.5 mm in the Purna sub-catchment. The basin also receives occasional rains during the post-monsoon period (October and November); the rainfall in the rest of the year is negligible. The hilly mountainous regions in the north receive relatively high rainfall, whereas the rest of the Yerli sub-catchment receives moderate rainfall (Fig. 8.3A). The streamflow at Yerli station exhibits high intra-annual variability, wherein the highest streamflow is recorded in August followed by September (Fig. 8.3B). The Purna sub-catchment generates relatively lower streamflow vis-à-vis the Burhanpur sub-catchment, due to the flat topography, alluvial plains, and higher rates of evapotranspiration and deep percolation losses in the former sub-catchment (Sharma et al., 2018b).

8.4.2 Methodology

The methodology adopted for streamflow forecasting using the MT approach is presented in Fig. 8.4. The selection of input parameters plays a vital role in the successful development and implementation of a streamflow forecasting model using DDMs (Jothiprakash and Kote, 2011b). Thus, two different statistical methods, viz., the autocorrelation function (ACF) and the partial autocorrelation function (PACF), are employed to determine the optimum number of parameters corresponding to different antecedent streamflow values (Jothiprakash and Kote, 2011b). The average mutual information (AMI) plot, which estimates the mutual dependence between two random variables, is used to determine the time lag (τ) corresponding to the first minimum AMI value (Wallot and Mønster, 2018). The value of τ is thereafter employed to determine the minimum number of embedding dimensions to represent a


FIGURE 8.3 (A) Spatial variability of rainfall across Yerli sub-catchment and (B) temporal streamflow variability at Yerli stream gauging station.

hydrological process using the false nearest neighbor (FNN) technique. The cross-correlation function (CCF) is estimated to determine the strength of the linear relationship between two variables. Thus, various model configurations need to be developed considering combinations of the selected input (predictor) variables, based on the aforesaid techniques.
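A lagged cross-correlation such as the CCF can be sketched in a few lines, as below; the two toy series and the lag range are assumptions for illustration only.

```python
import numpy as np

def cross_correlation(x, y, max_lag=12):
    """Pearson correlation between y(t) and x(t - lag) for lags 0..max_lag."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    ccf = {}
    for lag in range(max_lag + 1):
        if lag == 0:
            ccf[lag] = np.corrcoef(x, y)[0, 1]
        else:
            ccf[lag] = np.corrcoef(x[:-lag], y[lag:])[0, 1]
    return ccf

# Toy daily series: "streamflow" loosely follows "baseflow" with a short delay.
base = np.random.rand(500)
flow = 0.8 * np.roll(base, 2) + 0.2 * np.random.rand(500)
print({k: round(v, 2) for k, v in cross_correlation(base, flow, max_lag=5).items()})
```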


FIGURE 8.4 Methodology adopted for streamflow forecasting using the model tree technique (flowchart: observed daily rainfall, baseflow computed using a digital filter algorithm, and observed daily streamflow feed the determination of the model input structure based on correlation, average mutual information (AMI), and false nearest neighbour (FNN) criteria; candidate model configurations are then formulated from input variable combinations A–F, model tree variants PS, PUS, UPS, and UPUS, and data proportions M1–M3; statistical performance indices are computed for the calibration and validation stages; and the best-fit model for streamflow forecasting at Yerli stream gauging station is selected).

Several statistical performance evaluation measures have been proposed by Moriasi et al. (2007) and Karran et al. (2014) for evaluation of model performances. In the present case study, the model performances are evaluated using root mean squared error (RMSE), mean absolute error (MAE), coefficient of determination (R2), and Nash–Sutcliffe efficiency (NSE), which are described below:

\[ \mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n}\left(Q_{obs_i} - Q_{sim_i}\right)^{2}}{n}} \tag{8.2} \]

\[ \mathrm{MAE} = \frac{\sum_{i=1}^{n}\left|Q_{obs_i} - Q_{sim_i}\right|}{n} \tag{8.3} \]

\[ R^{2} = \left[\frac{\sum_{i=1}^{n}\left(Q_{obs_i} - \overline{Q}_{obs}\right)\left(Q_{sim_i} - \overline{Q}_{sim}\right)}{\sqrt{\sum_{i=1}^{n}\left(Q_{obs_i} - \overline{Q}_{obs}\right)^{2} \cdot \sum_{i=1}^{n}\left(Q_{sim_i} - \overline{Q}_{sim}\right)^{2}}}\right]^{2} \tag{8.4} \]

\[ \mathrm{NSE} = 1 - \frac{\sum_{i=1}^{n}\left(Q_{obs_i} - Q_{sim_i}\right)^{2}}{\sum_{i=1}^{n}\left(Q_{obs_i} - \overline{Q}_{obs}\right)^{2}} \tag{8.5} \]

where \(Q_{obs_i}\) and \(Q_{sim_i}\) indicate observed and simulated streamflows, respectively, \(\overline{Q}_{obs}\) and \(\overline{Q}_{sim}\) denote the mean observed and simulated streamflows, respectively, and \(n\) is the total number of observations.
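A compact sketch of Eqs. (8.2)–(8.5) in Python is given below; it is illustrative only (the chapter's models are evaluated within WEKA), and assumes `q_obs` and `q_sim` are equal-length NumPy arrays of observed and simulated streamflows.

```python
# Illustrative implementation of the four performance criteria, Eqs. (8.2)-(8.5);
# q_obs and q_sim are assumed equal-length NumPy arrays.
import numpy as np

def rmse(q_obs, q_sim):
    return np.sqrt(np.mean((q_obs - q_sim) ** 2))          # Eq. (8.2)

def mae(q_obs, q_sim):
    return np.mean(np.abs(q_obs - q_sim))                   # Eq. (8.3)

def r_squared(q_obs, q_sim):
    num = np.sum((q_obs - q_obs.mean()) * (q_sim - q_sim.mean()))
    den = np.sqrt(np.sum((q_obs - q_obs.mean()) ** 2) *
                  np.sum((q_sim - q_sim.mean()) ** 2))
    return (num / den) ** 2                                  # Eq. (8.4)

def nse(q_obs, q_sim):
    return 1.0 - (np.sum((q_obs - q_sim) ** 2) /
                  np.sum((q_obs - q_obs.mean()) ** 2))       # Eq. (8.5)
```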

8.5 Results and analysis

8.5.1 Selection of input variables

The daily monsoon streamflow series at Yerli station for the period 1973–2013 is analyzed to derive the internal structure of the streamflow series and its association with other hydroclimatic parameters. The ACF, PACF, AMI, and FNN plots for streamflow at Yerli station are shown in Fig. 8.5. The ACF of the

FIGURE 8.5 Determination of model input structure by analyzing (A) autocorrelation, (B) partial autocorrelation, (C) average mutual information, and (D) false nearest neighbor, for the daily monsoon streamflow series at Yerli station for the period 1973–2013.


streamflow series showed a rapid exponential decay (Fig. 8.5A). However, the ACF did not cross the positive axis and dampened to a near-zero value at 20 lags. On the other hand, the PACF crossed the positive axis after t = 1 (lag), oscillated back to positive values, and continued to oscillate and dampen with increasing time lag (Fig. 8.5B). The AMI value is found to decrease from 0.095 at τ = 0 to a minimum value at τ = 7, after which it rises again (Fig. 8.5C). By adopting this τ value (τ = 7) as the delay time in the FNN analysis, the minimum number of embedding dimensions is worked out to be 8 (Fig. 8.5D). The lumped rainfall exhibited a poor correlation with the observed streamflow after lag one (Fig. 8.6A). The poor correlation between the lumped rainfall and streamflow could be due to the low value of the runoff coefficient reported for the Yerli sub-catchment (Sharma et al., 2018b). In the present study, the baseflow is computed from the observed streamflow using the recursive digital filter algorithm proposed by Eckhardt (2005); a sketch of this filter is given after this paragraph. The value of the CCF between observed streamflow and computed baseflow is found to be high for Yerli station, and it exhibits a gradual decay (Fig. 8.6B). This is due to the porous aquifers extensively present in the Yerli sub-catchment, which yield a higher value of the baseflow index (Sharma et al., 2018b). The magnitude of correlation is found to decrease with increase in the window length of the average streamflow; however, such reduction was found to be inconsequential (Fig. 8.6C and D). Thus, it is observed that the antecedent baseflow and moving average streamflow contain more information for representing streamflow characteristics than the lumped rainfall.
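The following is a minimal, hedged sketch of the Eckhardt (2005) recursive filter referred to above; the recession constant `a` and `bfi_max` values are generic placeholders, not the parameters calibrated for the Yerli sub-catchment.

```python
# Illustrative Eckhardt (2005) recursive digital filter for baseflow separation.
# `a` (recession constant) and `bfi_max` are placeholder values, not the ones
# used in this case study.
import numpy as np

def eckhardt_baseflow(q, a=0.98, bfi_max=0.80):
    """b[t] = ((1 - bfi_max)*a*b[t-1] + (1 - a)*bfi_max*q[t]) / (1 - a*bfi_max),
    with baseflow capped at the total streamflow."""
    q = np.asarray(q, dtype=float)
    b = np.zeros_like(q)
    b[0] = bfi_max * q[0]                      # simple initialization assumption
    for t in range(1, len(q)):
        b[t] = ((1 - bfi_max) * a * b[t - 1]
                + (1 - a) * bfi_max * q[t]) / (1 - a * bfi_max)
        b[t] = min(b[t], q[t])                 # baseflow cannot exceed streamflow
    return b
```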

8.5.2 Model configuration

The whole streamflow time series of Yerli station was divided into two subseries following three different data proportions, viz., 60:40, 70:30, and 80:20, for calibration and validation of the model. The corresponding model configurations for the aforesaid three proportions are designated as M1, M2, and M3, respectively; a sketch of such a chronological split is given below. The statistical properties of the streamflow data for the calibration and validation periods are presented in Table 8.1. Here, the number of instances refers to the number of data points in a given dataset. The number of instances during the calibration (validation) period varies from 3,050 (1,952) to 3,904 (1,098). The M1, M2, and M3 datasets, obtained by partitioning the entire data into calibration and validation subsets, exhibited minimal variability and were broadly similar. From the correlation and FNN analyses, six different model configurations are suggested using combinations of input variables. The number of input parameters in the aforesaid proposed model configurations varies from 3 to 8 (Table 8.2).
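A minimal sketch of that chronological proportioning (illustrative only; the chapter performs the split within WEKA) could look like the following, assuming `df` is a time-ordered DataFrame of predictors and the target.

```python
# Hedged sketch of the M1/M2/M3 chronological data proportioning;
# `df` is a hypothetical time-ordered pandas DataFrame.
def proportion_split(df, calib_fraction):
    """Split a time-ordered frame into calibration and validation blocks."""
    n_calib = int(round(len(df) * calib_fraction))
    return df.iloc[:n_calib], df.iloc[n_calib:]

splits = {"M1": 0.60, "M2": 0.70, "M3": 0.80}
# calib, valid = proportion_split(df, splits["M1"])
```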

FIGURE 8.6 Cross-correlation of observed streamflow at Yerli with (A) lumped rainfall up to Yerli, (B) baseflow at Yerli, (C) QMOV3, (D) QMOV5, and (E) QMOV7 for Yerli station.

TABLE 8.1 Statistical properties for calibration and validation datasets at Yerli station.

Strategy | Stage | Period | Number of instances | Mean (m3/s) | SD | CV | Maximum (m3/s) | Sk | Percentage of zero (%)
M1 (60:40) | Calibration | 1973–1997 | 3,050 | 191.5 | 489.9 | 2.6 | 10,380.0 | 7.7 | 4.1
M1 (60:40) | Validation | 1998–2013 | 1,952 | 145.6 | 403.3 | 2.8 | 8,703.0 | 8.6 | 17.6
M2 (70:30) | Calibration | 1973–2001 | 3,538 | 183.9 | 471.5 | 2.6 | 10,380.0 | 7.7 | 4.2
M2 (70:30) | Validation | 2002–2013 | 1,464 | 148.7 | 425.0 | 2.9 | 8,703.0 | 8.8 | 21.9
M3 (80:20) | Calibration | 1973–2004 | 3,904 | 176.2 | 461.2 | 2.6 | 10,380.0 | 7.7 | 5.2
M3 (80:20) | Validation | 2005–2013 | 1,098 | 164.3 | 449.2 | 2.7 | 8,703.0 | 9.2 | 24.3

CV, coefficient of variation; SD, standard deviation; Sk, skewness.


TABLE 8.2 Model configurations for streamflow prediction at Yerli station.

Submodel name | No. of input variables | Input variables
A | 3 | (Qt−1, Qt−2, Qt−3)
B | 3 | (Rt−1, Rt−2, Rt−3)
C | 4 | (Qt−1, Qt−2, Rt−1, Rt−2)
D | 6 | (Qt−1, Qt−2, Rt−1, Rt−2, BFt−1, BFt−2)
E | 7 | (Qt−1, Qt−2, Rt−1, BFt−1, BFt−2, QMOV3, QMOV5)
F | 8 | (Qt−1, Qt−2, Rt−1, BFt−1, BFt−2, QMOV3, QMOV5, QMOV7)

Here, Qt−k refers to the antecedent streamflow at Yerli station, Rt−k is the antecedent lumped rainfall up to Yerli, BFt−k is the computed antecedent baseflow at Yerli station, while QMOV3, QMOV5, and QMOV7 indicate 3-, 5-, and 7-day moving mean streamflow (at lag one) at Yerli station, respectively. Here, k = 1, 2, 3.

8.5.3 Model calibration and validation

The MT technique for streamflow modeling is performed using the WEKA software (version 3.8), developed by the University of Waikato, New Zealand (Witten and Frank, 2005). The M5P algorithm is selected from the classifier tool to classify the instances (Quinlan, 1992). In total, 72 candidate models are formulated for the Yerli station considering combinations of 3 data proportioning configurations (i.e., M1, M2, and M3) × 6 input variable configurations (viz., A, B, C, D, E, and F) × 4 MT variants (i.e., pruned and smoothed [PS], unpruned and smoothed [UPS], pruned and unsmoothed [PUS], and unpruned and unsmoothed [UPUS]); this experimental design is sketched below. Here, the submodel configuration "A" represents a time series model, where antecedent observed streamflows at Yerli station are used as predictor variables. The submodel configuration "B" represents a cause-effect model, where antecedent lumped rainfall of the Yerli sub-catchment is used to predict the streamflow at Yerli station. The submodel configurations "C," "D," "E," and "F" are combined models, which consider antecedent streamflow, lumped rainfall, computed baseflow, and streamflow averaged over window length K (K = 3, 5, and 7) as predictor variables. The comparative performance of all candidate models is evaluated using the RMSE, MAE, R2, and NSE criteria.
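The enumeration of these 72 candidates can be written down compactly; the sketch below only illustrates the experimental design, and `fit_and_score_m5p` is a hypothetical placeholder, since the actual M5P runs were carried out in WEKA rather than through this code.

```python
# Sketch of the 3 x 6 x 4 candidate-model design; fit_and_score_m5p is a
# hypothetical helper, as M5P itself was run in WEKA in this study.
from itertools import product

data_splits   = ["M1", "M2", "M3"]                  # 60:40, 70:30, 80:20
input_sets    = ["A", "B", "C", "D", "E", "F"]      # Table 8.2 combinations
tree_variants = ["PS", "UPS", "PUS", "UPUS"]        # pruning/smoothing options

candidates = list(product(data_splits, input_sets, tree_variants))
assert len(candidates) == 72                        # 3 x 6 x 4 candidate models

# results = {c: fit_and_score_m5p(*c) for c in candidates}
```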

8.5.4 Sensitivity analysis of model configurations towards model performance

The sensitivity of the model outputs to the model configurations, involving different combinations of data proportioning, input variables, and MT variants, is analyzed and discussed in the following subsections.


8.5.4.1 Influence of input variable combinations

The results of the performance evaluation criteria for streamflow forecasting through different submodel configurations at Yerli station are shown in Fig. 8.7. It is seen that the "A," "B," "C," and "D" models showed poor performance compared to the "E" and "F" models. The "A" and "B" models displayed higher variability for all the performance evaluation criteria, whereas the "E" and "F" models showed the least variability. Moreover, the "A" and "B" models at times showed negative NSE values (Fig. 8.7D), rendering them unacceptable for streamflow forecasting. The "E" and "F" models incorporate information on antecedent streamflow, lumped rainfall, computed baseflow, and moving average streamflow, and differ from the other models in incorporating additional information about the K-day moving average streamflows (QMOVK). This shows that the QMOVK parameter effectively captures the streamflow variability in addition to the other parameters. Thus, the candidate models resulting from the combination of the "E" and "F" models with other configurations can be explored further to select a suitable model for streamflow forecasting at Yerli station.

FIGURE 8.7 Comparative assessment of the influence of input parameter selection on model performance at Yerli station. (a) root mean squared error, (b) coefficient of determination, (c) mean absolute error, (d) Nash-Sutcliffe efficiency. The blue and red box plots correspond to the calibration and validation stages, respectively.


8.5.4.2 Influence of model tree variants

The pruning and smoothing operations play a significant role in governing the model structure in the MT technique. Hence, the influence of the MT variants, viz., PS, PUS, UPS, and UPUS, on model performance is investigated for all submodel configurations (A, B, C, D, E, and F) and data proportioning configurations (M1, M2, and M3). Here, performance is evaluated for all submodel configurations, rather than only for the submodel configurations chosen in the previous step, so that the overall sensitivity of the MT variants on model performance can be analyzed. It is observed that the UPUS models exhibited superior performance over the other MT variants during the calibration period, in terms of lower values of the RMSE and MAE criteria and higher values of the R2 and NSE criteria (Fig. 8.8). The performance of the MT variants other than UPUS shows minimal variation during the calibration period; however, the UPS models showed better performance than the PS and PUS models. The UPUS models showed wide variability in performance between the calibration and validation periods (Fig. 8.8), though their median values during the validation period are comparable with those of the other MT variants. Thus, the UPS and UPUS models, in conjunction with submodels "E" and "F," performed better than the other variants and submodels, and hence, these better-performing model configurations can be used further.

FIGURE 8.8 Comparative assessment of the influence of model tree variants on model performance at Yerli station. (a) root mean squared error, (b) coefficient of determination, (c) mean absolute error, (d) Nash-Sutcliffe efficiency.


8.5.4.3 Influence of data proportioning

The sensitivity of streamflow forecasting to varying proportions of the calibration and validation datasets is also assessed. The effect of three data proportioning configurations, viz., 60:40 (M1), 70:30 (M2), and 80:20 (M3), on model performance is analyzed for all the submodel configurations (A, B, C, D, E, and F) and MT variants (PS, PUS, UPS, and UPUS) to derive the overall sensitivity. The variations in the performance evaluation criteria revealed small deviations among the M1, M2, and M3 models during the calibration period (Fig. 8.9). On the other hand, model performance during the validation period exhibited larger variability vis-à-vis the calibration period, particularly for the RMSE, R2, and NSE, although the differences were insignificant (Fig. 8.9). The higher variability observed in model performance during the validation period vis-à-vis the calibration period is attributed to the higher variance and skewness in the former, owing to the greater proportion of zero streamflow values (Table 8.1). Therefore, the combinations of the UPS and UPUS models with submodels "E" and "F" for all three data proportioning configurations (M1, M2, and M3) are screened to select the best-fit model for forecasting streamflows at the Yerli station.

FIGURE 8.9 Comparative assessment of the influence of data proportioning on model performance at Yerli station. (a) root mean squared error, (b) coefficient of determination, (c) mean absolute error, (d) Nash-Sutcliffe efficiency.


8.5.5 Selection of best-fit model for streamflow forecasting

The results of the performance evaluation criteria for the calibration and validation periods for all the candidate models indicated that the submodel configurations "E" and "F" performed better and provided the most accurate forecasts of streamflow at Yerli station. Hence, all submodel configurations other than "E" and "F" (viz., "A," "B," "C," and "D") are eliminated from further consideration. The UPS models invariably performed better than the other MT variants for Yerli station; hence, model combinations of UPS with submodel configurations "E" and "F" are carried forward for further analysis. It was also found earlier that the M1, M2, and M3 models had comparable variations, and hence, the aforesaid model combinations are formulated in conjunction with M1, M2, and M3 for further assessment. The performance evaluation criteria further suggested that the M1E UPS model showed better accuracy in predicting the peak streamflow instances vis-à-vis the other models. Therefore, M1E UPS was chosen as the best-fit model for streamflow forecasting at Yerli station. The chosen M1E UPS model at Yerli station comprised 683 rules and was calibrated and validated with 60% and 40% of the data, respectively. The performance evaluation criteria estimated for the M1E UPS model yielded RMSE values of 111.9 m3/s and 165.6 m3/s during the calibration and validation periods, respectively. The MAE values are found to be nearly the same during the model calibration and validation stages, which indicates a better prediction of moderate streamflow values. The model exhibits good predictability, with R2 and NSE values of 0.94 (0.83) and 0.95 (0.83) during calibration (validation), respectively. The time series plots of observed and predicted streamflows using the M1E UPS model, for both calibration and validation stages, are shown in Fig. 8.10. The M1E model amalgamates information about antecedent streamflows, lumped rainfall, computed baseflow, and moving average streamflow at the station to predict 1-day-ahead streamflow. The model predictions showed good performance during the calibration and validation stages; however, certain deviations for high streamflows are observed during the validation period (Fig. 8.10B and D). From the data analysis, it is observed that the Yerli station on the Purna River does not exhibit high flood regimes owing to the intermittent nature of streamflows. Furthermore, owing to the semiarid climate, flat topography, and agriculture-dominated lands in the Purna sub-catchment, streamflow generation from this catchment is comparatively low. Hence, streamflow modeling is a challenging task for such catchments exhibiting varied hydroclimatic and physiographic characteristics. However, the overall prediction accuracy of the model is found to be satisfactory as per the criteria laid down by Moriasi et al. (2007), especially for moderate flows, and it can be employed for forecasting daily streamflows at Yerli station by incorporating real-time information on hydroclimatic variables. This streamflow forecasting model would be helpful in water resources management for the water-scarce, semiarid Yerli sub-catchment.


FIGURE 8.10 Temporal variability of observed and simulated streamflow during (A) calibration and (B) validation stage of the best-fit model chosen for streamflow forecasting at Yerli station.


8.6 Summary and conclusions

This chapter discussed the application of DDMs in hydrological studies. The MT technique, which is based on decision trees and approximates nonlinear hydrological processes through piecewise linearization, has been discussed, and its applicability in forecasting 1-day-ahead streamflow is demonstrated through a case study. The MT approach offers specific advantages: ease of development, lower time and data requirements, low computational cost, and easily interpretable results. From the literature, it is perceived that MT exhibits performance comparable to other DDMs such as ANN, fuzzy logic, support vector machine, etc. Unlike other DDMs, the outputs of MT, in the form of linear regression equations, are easy to interpret and can be applied in practice without specific training. From the case study, it has been found that the selection of predictor variables in modeling streamflow plays a vital role in capturing the catchment memory, and the model performance was particularly sensitive to the combinations of input variables. The division of the dataset into different proportions showed a marginal influence on model performance. However, the MT variants, i.e., the pruning and smoothing operations, showed a considerable impact on model performance; overall, the UPS models showed a marginal advantage over the other models. The chosen model for streamflow prediction at Yerli station exhibited better prediction accuracy for moderate streamflows, with mean absolute errors of 43.3 and 44.4 m3/s for calibration and validation, respectively, vis-à-vis high streamflows, with RMSE of 111.9 and 165.6 m3/s for calibration and validation, respectively. However, hybrid MT models, which involve data preprocessing prior to input into the model, can be explored for improvement in prediction accuracy up to several lead times and at finer temporal scales.

Acknowledgments The first author gratefully acknowledges the financial support received from the Department of Science and Technology (DST), Ministry of Science and Technology, Government of India vide their letter no. DST/INSPIRE Fellowship/2015/IF150634 dated January 11, 2016. The authors are thankful to Center of Excellence (CoE) on “Water Resources and Flood Management,” TEQIP-II, Ministry of Human Resources Development (MHRD), and INCCC sponsored research project titled “Impact of Climate Change on Water Resources of Tapi Basin,” Ministry of Jal Shakti (MoJS), Government of India, for providing necessary infrastructural support for conducting the present study. The authors express sincere thanks to India Meteorological Department (IMD), Pune, and Central Water Commission (CWC), Surat, for providing necessary data for the study reported in the paper. The authors express sincere gratitude to the Reviewer for providing valuable suggestions and corrections in improving the quality of the manuscript.

References

Ajmera, T.K., Goyal, M.K., 2012. Development of stage–discharge rating curve using model tree and neural networks: an application to Peachtree Creek in Atlanta. Expert Syst. Appl. 39 (5), 5702–5710. https://doi.org/10.1016/j.eswa.2011.11.101.
Arunkumar, R., Jothiprakash, V., 2012. Reservoir evaporation prediction using data-driven techniques. J. Hydrol. Eng. ASCE 18 (1), 40–49. https://doi.org/10.1061/(ASCE)HE.1943-5584.0000597.


Bourdin, D.R., Fleming, S.W., Stull, R.B., 2012. Streamflow modelling: a primer on applications, approaches and challenges. Atmos. Ocean 50 (4), 507–536. https://doi.org/10.1080/07055900.2012.734276.
Eckhardt, K., 2005. How to construct recursive digital filters for baseflow separation. Hydrol. Process. 19 (2), 507–515. https://doi.org/10.1002/hyp.5675.
Esmaeilzadeh, B., Sattari, M.T., Samadianfard, S., 2017. Performance evaluation of ANNs and an M5 model tree in Sattarkhan Reservoir inflow prediction. ISH J. Hydraul. Eng. 23 (3), 283–292. https://doi.org/10.1080/09715010.2017.1308277.
Galelli, S., Castelletti, A., 2013. Assessing the predictive capability of randomized tree-based ensembles in streamflow modelling. Hydrol. Earth Syst. Sci. 17 (7), 2669–2684. https://doi.org/10.5194/hess-17-2669-2013.
Garg, V., Jothiprakash, V., 2013. Evaluation of reservoir sedimentation using data driven techniques. Appl. Soft Comput. 13 (8), 3567–3581. https://doi.org/10.1016/j.asoc.2013.04.019.
Goyal, M.K., 2014. Modeling of sediment yield prediction using M5 model tree algorithm and wavelet regression. Water Resour. Manag. 28 (7), 1991–2003. https://doi.org/10.1007/s11269-014-0590-6.
Han, D., Kwong, T., Li, S., 2007. Uncertainties in real-time flood forecasting with neural networks. Hydrol. Process. 21 (2), 223–228. https://doi.org/10.1002/hyp.6184.
Iorgulescu, I., Beven, K.J., 2004. Nonparametric direct mapping of rainfall-runoff relationships: an alternative approach to data analysis and modeling? Water Resour. Res. 40 (8), W08403. https://doi.org/10.1029/2004WR003094.
Jain, S.K., Agarwal, P.K., Singh, V.P., 2007. Hydrology and Water Resources of India, vol. 57. Springer-Verlag, Heidelberg, p. 567. https://doi.org/10.1007/1-4020-5180-8.
Jothiprakash, V., Kote, A.S., 2011a. Improving the performance of data-driven techniques through data pre-processing for modelling daily reservoir inflow. Hydrol. Sci. J. 56 (1), 168–186. https://doi.org/10.1080/02626667.2010.546358.
Jothiprakash, V., Kote, A.S., 2011b. Effect of pruning and smoothing while using M5 model tree technique for reservoir inflow prediction. J. Hydrol. Eng. ASCE 16 (7), 563–574. https://doi.org/10.1061/(ASCE)HE.1943-5584.0000342.
Karran, D.J., Morin, E., Adamowski, J., 2014. Multi-step streamflow forecasting using data-driven non-linear methods in contrasting climate regimes. J. Hydroinf. 16 (3), 671–689. https://doi.org/10.2166/hydro.2013.042.
Keshtegar, B., Kisi, O., Zounemat-Kermani, M., 2019. Polynomial chaos expansion and response surface method for nonlinear modelling of reference evapotranspiration. Hydrol. Sci. J. 64 (6), 720–730. https://doi.org/10.1080/02626667.2019.1601727.
Kompare, B., Steinman, F., Cerar, U., Dzeroski, S., 1997. Prediction of rainfall runoff from catchment by intelligent data analysis with machine learning tools within the artificial intelligence tools. Acta Hydrotech. 16 (17), 79–94. https://doi.org/10.1007/s12040-008-0005-2.
Loliyana, V.D., Patel, P.L., 2015. Lumped conceptual hydrological model for Purna River Basin, India. Sadhana 40 (8), 2411–2428. https://doi.org/10.1007/s12046-015-0407-1.
Loliyana, V.D., Patel, P.L., 2018. Performance evaluation and parameters sensitivity of a distributed hydrological model for a semi-arid catchment in India. J. Earth Syst. Sci. 127 (8), 117. https://doi.org/10.1007/s12040-018-1021-5.
More, D., Magar, R.B., Jothiprakash, V., 2019. Intermittent reservoir daily inflow prediction using stochastic and model tree techniques. J. Inst. Eng. India A 1–8. https://doi.org/10.1007/s40030-019-00368-w.

Moriasi, D.N., Arnold, J.G., Van Liew, M.W., Bingner, R.L., Harmel, R.D., Veith, T.L., 2007. Model evaluation guidelines for systematic quantification of accuracy in watershed simulations. T. ASABE 50 (3), 885–900. https://doi.org/10.13031/2013.23153.
Nalarajan, N.A., Mohandas, C., 2015. Groundwater level prediction using M5 model trees. J. Inst. Eng. India A 96 (1), 57–62. https://doi.org/10.1007/s40030-014-0093-8.
Nourani, V., Davanlou Tajbakhsh, A., Molajou, A., Gokcekus, H., 2019. Hybrid Wavelet–M5 model tree for rainfall–runoff modeling. J. Hydrol. Eng. ASCE 24 (5), 04019012. https://doi.org/10.1061/(ASCE)HE.1943-5584.0001777.
Oyebode, O., Otieno, F., Adeyemo, J., 2014. Review of three data-driven modelling techniques for hydrological modelling and forecasting. Fresenius Environ. Bull. 23 (7), 1443–1454.
Perrin, C., Michel, C., Andréassian, V., 2001. Does a large number of parameters enhance model performance? Comparative assessment of common catchment model structures on 429 catchments. J. Hydrol. 242 (3–4), 275–301. https://doi.org/10.1016/S0022-1694(00)00393-0.
Quinlan, J.R., 1992. Learning with continuous classes. In: Adams, A., Sterling, L. (Eds.), Proceedings of 5th Australian Joint Conference on Artificial Intelligence, Hobart, Tasmania, November 16–18, 1992, vol. 92, pp. 343–348.
Rezaie-Balf, M., Kim, S., Fallah, H., Alaghmand, S., 2019. Daily river flow forecasting using ensemble empirical mode decomposition based heuristic regression models: application on the perennial rivers in Iran and South Korea. J. Hydrol. 572, 470–485. https://doi.org/10.1016/j.jhydrol.2019.03.046.
Rezaie-Balf, M., Zahmatkesh, Z., Kim, S., 2017. Soft computing techniques for rainfall-runoff simulation: local non-parametric paradigm vs. model classification methods. Water Resour. Manag. 31 (12), 3843–3865. https://doi.org/10.1007/s11269-017-1711-9.
Sharma, P.J., Loliyana, V.D., Resmi, S.R., Timbadiya, P.V., Patel, P.L., 2018a. Spatiotemporal trends in extreme rainfall and temperature indices over Upper Tapi Basin, India. Theor. Appl. Climatol. 134 (3–4), 1329–1354. https://doi.org/10.1007/s00704-017-2343-y.
Sharma, P.J., Patel, P.L., Jothiprakash, V., 2018b. Assessment of variability in runoff coefficients and their linkages with physiographic and climatic characteristics of two contrasting catchments. J. Water Clim. Change 10 (3), 464–483. https://doi.org/10.2166/wcc.2018.139.
Sharma, P.J., Patel, P.L., Jothiprakash, V., 2019. Impact of rainfall variability and anthropogenic activities on streamflow changes and water stress conditions across Tapi Basin in India. Sci. Total Environ. 687, 885–897. https://doi.org/10.1016/j.scitotenv.2019.06.097.
Solomatine, D.P., 2005. Data-driven modeling and computational intelligence methods in hydrology. In: Anderson, M. (Ed.), Encyclopedia of Hydrological Sciences. Wiley, New York. https://doi.org/10.1002/0470848944.hsa021.
Solomatine, D.P., Dulal, K.N., 2003. Model trees as an alternative to neural networks in rainfall–runoff modelling. Hydrol. Sci. J. 48 (3), 399–411. https://doi.org/10.1623/hysj.48.3.399.45291.
Solomatine, D.P., Ostfeld, A., 2008. Data-driven modelling: some past experiences and new approaches. J. Hydroinf. 10 (1), 3–22. https://doi.org/10.2166/hydro.2008.015.
Solomatine, D.P., Xue, Y., 2004. M5 model trees and neural networks: application to flood forecasting in the upper reach of the Huai River in China. J. Hydrol. Eng. ASCE 9 (6), 491–501. https://doi.org/10.1061/(ASCE)1084-0699(2004)9:6(491).
Wallot, S., Mønster, D., 2018. Calculation of average mutual information (AMI) and false-nearest neighbors (FNN) for the estimation of embedding parameters of multidimensional time series in matlab. Front. Psychol. 9, 1679. https://doi.org/10.3389/fpsyg.2018.01679.


Witten, I.H., Frank, E., 2005. Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann Publishers, San Francisco, USA.
Yaseen, Z.M., Kisi, O., Demir, V., 2016. Enhancing long-term streamflow forecasting and predicting using periodicity data component: application of artificial intelligence. Water Resour. Manag. 30 (12), 4125–4151. https://doi.org/10.1007/s11269-016-1408-5.
Zia, H., Harris, N., Merrett, G., Rivers, M., 2015. Predicting discharge using a low complexity machine learning model. Comput. Electron. Agric. 118, 350–360. https://doi.org/10.1016/j.compag.2015.09.012.

Chapter 9

Averaging multiclimate model prediction of streamflow in the machine learning paradigm

Kevin O. Achieng1, 2

1 Department of Civil and Architectural Engineering, University of Wyoming, Laramie, WY, United States; 2 Department of Crop & Soil Sciences, University of Georgia, Athens, GA, United States

9.1 Introduction

Climate change poses an imminent threat to the availability of surface water and groundwater, especially by the mid-21st century. Numerical models, in the form of both global and regional climate models, are becoming common tools to simulate hydrological variables like streamflow (Achieng and Zhu, 2019a; Chan et al., 2020; Chien et al., 2013; Takle et al., 2010). However, these climate models often produce dissimilar results due to parametric differences among the models, which include differences in model physics, the relatively coarse spatial resolution especially of the global climate models (GCMs), and differences in the source input data that drive the climate models (Crosbie et al., 2011). The uncertainty in the climate models' hydrological outputs is addressed by developing regional climate models (RCMs), which have relatively fine spatial resolutions (Gutowski et al., 2010; Mearns et al., 2012; Shrestha et al., 2012). However, the enhanced spatial resolution brought by the advent of RCMs does not result in perfectly accurate estimates of hydrological parameters, and there still exists uncertainty in the simulated hydrological variables across the climate models. One approach for further reducing the uncertainty is model selection, where climate models that do not simulate hydrological variables closely are eliminated from a large group of climate models. However, the model selection approach is sometimes disadvantageous, as only the best model is retained and the rest are excluded, even though the disqualified models may have required comparable time and money to develop and may be better at simulating other aspects of the hydrological variable. Hence, combinatory methods such as model averaging



in Bayesian and machine learning frameworks are increasingly being used to take advantage of the relative strengths of the RCMs, whereby the hydrological simulations from multiple climate models are averaged (Achieng and Zhu, 2019a,b; Deo and Sahin, 2016; Shortridge et al., 2016). Total/overall streamflow, low streamflow, and high streamflow are important flow regimes for effective water resources management. Low streamflow is associated with droughts, a drought being a prolonged shortage of moisture in a hydrological system due to high temperatures or low precipitation (Bae et al., 2019; Van Loon, 2015). Therefore, both temperature and precipitation measurements are often used as input variables for prediction of drought indicators such as the seasonal standardized precipitation index (SPI) (Bae et al., 2019; McKee et al., 1993). High streamflow regimes are often associated with floods, which are mitigated nonstructurally through flood forecasting (Jiang et al., 2016; Krajewski et al., 2017). River stage measurement is used as a proxy in flood forecasting. Different machine learning methods have been used to statistically average streamflow from multiple models (Deo and Sahin, 2016; Maity et al., 2010a; Tongal and Booij, 2018). The machine learning models that have been widely used in streamflow prediction and forecasting include artificial neural networks (ANNs), support vector regression (SVR) models, genetic programming, and random forest. However, this chapter pays special attention to the SVR and ANN models, as these are the two most commonly used machine learning methods in streamflow forecasting (e.g., Daliakopoulos and Tsanis, 2016; Honoratoda et al., 2018; Kişi, 2007; Maity et al., 2010a; Tongal et al., 2018).

9.2 Salient review on ANN and SVR modeling for streamflow forecasting

Neural networks and SVR models are increasingly being used in streamflow studies. ANN or SVR can be conceptualized as a lumped parametric model consisting of a black box with inputs (i.e., observed rainfall and streamflow) and outputs (i.e., predicted streamflow) (Dawson and Wilby, 2001). This is because neither ANN nor SVR requires us to understand the physical structure of the hydrological process parameters within the hydrological system (e.g., a river basin), which is particularly useful in cases where we are solely interested in accurate model predictions. This section provides a salient review of the application of these models in streamflow studies. Application of the widely used machine learning methods in streamflow modeling (i.e., ANNs and SVR) started in the 1990s, although there is no record of SVR being applied to streamflow modeling in that decade; applications of both ANN and SVR models in streamflow forecasting gained popularity in the 2000s. The first successful streamflow study to apply ANN was conducted in a river basin in Pisuena, Spain (Crespo and Mora, 1993), where eleven years of Pisuena River stage data were used as input to the ANN to predict drought with satisfactory accuracy.


The ANN model was applied in streamflow forecasting in the Pyung Chang River basin in South Korea (Kang et al., 1993), and the results suggested superiority of the ANN model compared to classical statistical methods such as the autoregressive moving average (ARMA) model. Another early study demonstrating that ANN produces plausible streamflow forecasts was that of Karunanithi et al. (1994), who successfully applied ANN to predict streamflow of the Huron River, Michigan, USA. ANN and autoregressive models were found to produce comparably reliable predictions of monthly streamflow into two reservoirs located in the Bharathapuzha basin, India (Raman and Sunilkumar, 1995). Reliable ANN-based runoff prediction was also evident in a study conducted in the Araxisi watershed in Sardinia, Italy, using three decades of input data (Lorrai and Sechi, 1995). Hsu et al. (1995) noted that ANN is most appropriate for modeling the runoff process if we do not need to understand either the physical meaning of the model parameters or the internal structure of the hydrological system. They further noted that, in the medium-size Leaf River basin (Mississippi), the ANN produced superior runoff predictions compared with either a time series model (e.g., autoregressive moving average with exogenous inputs [ARMAX]) or a conceptual model (e.g., the Sacramento soil moisture accounting model [SAC-SMA]). A climate change impact study was successfully conducted, using ANN, over 15 river basins in Canada's Atlantic region using a 10-year period (1983–92) of input data (Clair and Ehrman, 1996); evapotranspiration and temperature were found to play a vital role in the rainfall-runoff model and the associated solute transport. ANN has been used to provide plausible values of runoff coefficients for handling urban storm drainage (Loke et al., 1997). Satisfactory ANN-based real-time forecasting of streamflow, particularly flood forecasting, was obtained in a streamflow study by Thirumalaiah and Deo (1998). ANN was also used to model runoff at a quarter-monthly time interval for a 20,000 km2 catchment in the Winnipeg River basin, Canada; this comparative study showed that ANN produces better short-term streamflow forecasts than the conventional method (Zealand et al., 1999).

In the 2000s, the number of studies employing machine learning models for streamflow forecasting continuously increased. In a study conducted in three river basins (the Fraser River in Colorado, Raccoon Creek in Iowa, and the Little Patuxent River in Maryland), ANN was found to have shorter training time and superior monthly streamflow forecasts compared with both the SAC-SMA model and a simple conceptual rainfall-runoff model (Tokar and Markus, 2000); the three river basins were used because they embody different topographical and climate regimes. Streamflow forecasts using ANN were found to be more reliable for lead times shorter than 7 days and less reliable for longer lead times such as 14, 21, and 28 days, based on a streamflow study conducted in


the Chao Phraya River, Thailand, for the April 1978–March 1994 time period (Jayawardena and Fernando, 2001). One of the first studies to investigate the potential of SVR to model streamflow was conducted by Sivapragasam et al. (2001), who showed that SVR predicted streamflow in a Danish catchment with higher accuracy than a nonlinear model. In the same year, Dibike et al. (2001) demonstrated that SVR outperforms ANN in simulating runoff. A study conducted in the River Yangtze, China, revealed that ANN models produced reasonable streamflow predictions (Dawson et al., 2002). A study conducted in the Italian Sieve River basin demonstrated that ANN does a better job than a conceptual rainfall-runoff model in predicting runoff when the input data capture all flow regimes, and a poor job if the major flood features are missing from the input dataset (i.e., historical streamflow) (Toth and Brath, 2002). In a comparative study conducted in Bangladesh, the SVR model provided similar or better predictions of four-lead-day to seven-lead-day maximum water levels than ANN (Liong and Sivapragasam, 2002). ANN has been applied with success in forecasting floods in the Italian River Arno, with errors of 7%–15% for lead times of 1–6 h, respectively (Campolo et al., 2003). Even though ANN struggled with predicting peak flows, it was found to produce superior short-term daily runoff predictions compared with the Box–Jenkins method (Castellano-Méndez et al., 2004), based on a study conducted in the Xallas River basin in Galicia, Spain. In a study conducted in the Narmada River basin, India, ANN produced superior runoff prediction compared with a rainfall-runoff linear transfer model, even with variable and uncertain datasets as input data to the two models (Agarwal and Singh, 2004). A study conducted in the Bykmenderes basin (Turkey) compared runoff prediction by an ANN model with that of multilinear regression (MLR) (Cigizoglu and Alp, 2004); ANN was found to produce runoff predictions statistically comparable to those of MLR. Better daily streamflow prediction was obtained from ANN than from a linear regression model in a study conducted in the semiarid Ourika catchment in Morocco (Riad et al., 2004). Compared to an autoregressive integrated moving average time series model, superior river flow prediction has been obtained with SVR in two case study rivers: the Tryggevælde catchment, Denmark, and the Mississippi River, USA (Yu et al., 2004). An ANN-based comparative study conducted in the Krishna River, India, revealed that even though ANN has been found to produce reliable streamflow predictions in many studies, the structure of the ANN affects the accuracy of runoff prediction (Senthil Kumar et al., 2005); in that study, for example, a multilayer perceptron-based ANN (with a sigmoid transfer function) was found to be superior, at predicting streamflow, to a radial basis function (RBF)-based neural network. SVR forecasted spring and fall season streamflow with higher accuracy than linear discriminant analysis and multinomial logistic regression models in a study conducted in the Pacific Northwest region of the United States (She and Basketfield, 2005).


Antar et al. (2006) conducted a study to predict runoff in the 300,000 km2 Blue Nile River basin. They trained the ANN model using a 5-year (1992–96) dataset and tested the model using a 3-year (1997–99) dataset. The results showed that, even though the ANN model produced superior runoff predictions compared with a physically based distributed model, ANN had trouble predicting high flows due to the short training data record. Another comparative study, performed in the Bulken and Skarsvatn river basins in Norway, showed that the ANNs outperformed process-based streamflow models (Nilsson et al., 2006). The SVR models performed well in producing plausible seasonal and hourly streamflow predictions in the Sevier River basin, Utah, USA (Asefa et al., 2006). Similarly, the SVR models outperformed the ANN and ARMA models in predicting streamflow using the January 1974 to December 2003 dataset of the Manwan Reservoir in China (Jian et al., 2006). Yu et al. (2006) plausibly predicted hourly river stages associated with flash floods, using rainfall and river stage as inputs to SVR models, thereby effectively forecasting flood stages in the Lan-Yang River, Taiwan. A comparative study of three ANNs for predicting short-range and long-range daily streamflow in two Turkish rivers revealed that the RBF-based ANN performs better than feedforward backpropagation and generalized regression neural networks (Kisi and Cigizoglu, 2007). Modified SVR models have been used for superior real-time forecasting of flood river stages in the Lan-Yang River in Taiwan (Chen and Yu, 2007). Studies conducted in the Bird Creek catchment of the United States revealed that SVR models provide robust estimates of flood forecasts, even though they may suffer more from overfitting than underfitting (Han et al., 2007). A study conducted in a small watershed of the Tono area in Japan showed that ANN produces better runoff predictions than conventional rainfall-runoff models, particularly in wet seasons (Sohail et al., 2008). Real-time flood forecasts have also been predicted with SVR (Yu et al., 2008). River flow at the current time step (Qt) has been plausibly forecasted with ANN, using previous monthly river flows (Qt−1, ..., Qt−5) as input, in the Beyderesi and Kocasu rivers in Turkey (Partal, 2009). SVR outperforms ANN in predicting 1-day-lead streamflow according to a study conducted in the Bakhtiyari River in Iran (Behzad et al., 2009). The SVR models have also been found to provide plausible predictions of long-term monthly river flow discharges in the Lancangjiang River, Southwest China (Wang et al., 2009). Runoff prediction for an Indian watershed, the Vamsadhara River basin, showed that SVR outperformed both the ANN model and a multiple regressive pattern recognition technique (Misra et al., 2009). Reasonable streamflow forecasts were obtained with ANN models in a study conducted in the Narmada catchment of India (Londhe and Charhate, 2010). Studies conducted in data-limited regions, such as the Blue Nile River in Sudan, have shown that ANN provides plausible streamflow predictions (Shamseldin, 2010). A comparative ANN study


compared the ability of multiple ANN activation functions to predict discharge based on a 5-year record of data (Zadeh et al., 2010); the results of this study showed that daily discharge is better predicted with an ANN using a tangent sigmoid activation function. Better streamflow prediction has been achieved using SVR than the traditional Box–Jenkins approach (Maity et al., 2010). A one-hidden-layer ANN was used to predict monthly future discharge, using an unreliable (short length and low standard) input dataset, for the arid Hub River watershed of Pakistan. Current and previous monthly rainfall and discharge were used as ANN input, and the ANN model was trained, validated, and tested with 10 years (1963–73), 2 years (1974 and 1975), and 4 years (1976–79) of monthly data, respectively. This study demonstrated that ANN can produce reliable monthly river runoff predictions, better than conventional rainfall-runoff models, even when the data are scarce and scanty (Ghumman et al., 2011). A study conducted in the Jangada River basin, Paraná, Brazil, showed that the ANN model produced more reliable monthly runoff predictions than a conceptual model (Machado et al., 2011). A wavelet-SVR model was used to effectively forecast monthly streamflow of two Turkish rivers, the Canakdere River and the Goksudere River (Kisi and Cimen, 2011). A study conducted in the mountainous Sianji watershed (Himalayan region, India) simulated daily runoff using a July 1, 2001 to June 30, 2004 dataset (Adamowski and Prasher, 2012); the results indicated that the SVR model produced accurate predictions of daily runoff. Compared to ANN, SVR was found to do a better job at modeling the relationship between river stage, discharge, and sediment (Jain, 2012). In a drought study conducted in Iran, the SVR models predicted SPI values with robust performance (Shahbazi et al., 2011). Under the limited data availability conditions of the Sianji watershed in the Himalayan region of India, the SVR models yielded reasonable streamflow forecasts, even though modified ANN models slightly outperformed the SVR models (Adamowski and Prasher, 2012). In Peninsular Malaysia, SVR provided more accurate predictions of streamflow at ungauged sites than MLR models, using data from 88 water level stations (Zakaria and Shabri, 2012). Monthly river flow has been found to be forecasted better with SVR than with ANN, based on a study conducted at two gauging stations in Northern Iran (Kalteh, 2013); results of the same study suggested that both ANN and SVR flow predictions were enhanced when the models were integrated with the wavelet transform. In another study, conducted in the Linbien River basin, Taiwan, typhoon-driven runoff was predicted more accurately with ANN than with regression models (Chen et al., 2013). In the Awash River basin of Ethiopia, it was revealed that wavelet neural networks (WNN) predicted the 3-month SPI comparatively better than the SVR and ANN models (Belayneh and Adamowski, 2013). In a study conducted in the Kentucky River, USA, and the Kolar River, India, a sequential ANN was found to provide reliable hourly streamflow forecasts of up to 8 h in advance (Prakash et al., 2014). In another study, ANN produced reliable streamflow sediment prediction


irrespective of whether a complete dataset is used or only a subset of the dataset representing a critical hydrological event (i.e., a pruned dataset) is used (Singh et al., 2014). The monthly streamflow prediction capability of SVR has been found to improve when the SVR is coupled with empirical mode decomposition (EMD), as demonstrated in a streamflow forecast study conducted in the Wei River basin, China (Huang et al., 2014). A comparative evaluation of the SVR and ANN models in streamflow modeling over Mediterranean, Oceanic, and Hemiboreal watersheds (Alexander Stream in Israel, the Koksilah River in British Columbia, Canada, and the Upper Bow River in Alberta, Canada, respectively) demonstrated that the performance of SVR was superior to that of the ANN models in streamflow simulation (Karran et al., 2014). The SVR models provided plausible estimates of SPI-based drought forecasts in a study conducted over western Rajasthan in India (Ganguli and Janga Reddy, 2014). Plausible ANN-based streamflow prediction was obtained for the Shoor Ghayen River in Iran (Rezaei et al., 2015). Acceptable 1-day-ahead streamflow prediction was obtained in the Krishna and Narmada River basins, India, using an SVR model (Londhe and Gavraskar, 2015). Rainfall-runoff modeling performed for five rivers of the Lake Tana basin, Ethiopia, revealed that the ANN model predicted viable runoff estimates across the basin (Shortridge et al., 2016). Streamflow prediction in the Tomebamba River with ANN was found to be useful for flood prediction for the city of Cuenca, Ecuador (Veintimilla-Reyes et al., 2016). Compared to the commonly used Regional Flood Frequency Analysis method, the SVR model has been found to provide reliably superior predictions of current and future floods over catchments in British Columbia (BC) and Ontario (ON), Canada (Gizaw and Gan, 2016). In a study conducted in the Jinsha River, China, the SVR model did a better job at predicting streamflow if the input data were first decomposed with either the discrete wavelet transform or EMD (Zhu et al., 2016). According to a study conducted in the Ajichai River basin, Iran, wavelet-ANN outperformed both conventional ANN and SVR models in predicting daily streamflow (Shafaei and Kisi, 2017); this is attributed to the ability of the wavelet transform to eliminate redundant error in the historical streamflow data (Peng et al., 2017). A comparative study conducted over the upper Indus Basin showed that SVR outperforms ANN in predicting monthly streamflow (Adnan et al., 2017). Similarly, the SVR and ANN models were employed for statistically comparable rainfall-runoff modeling in the Chishan Creek basin in southern Taiwan (Young et al., 2017). Superior monthly streamflow forecasting in the River Nile Basin was obtained with ANN when the seasonality was first removed from the input data (Elganiny and Eldwer, 2018). With historical streamflow, precipitation, and temperature as input data, the SVR model outperformed other machine learning models (e.g., fuzzy genetic algorithm and model tree techniques) in predicting daily streamflow of the poorly gauged Hunza River, Pakistan (Adnan et al., 2018). A streamflow forecasting study conducted for four rivers in the United States


(the East Fork of the Carson River, the Sacramento River, the North Fork River at North Fork Dam, and the Chehalis River) concluded that coupling the ANN and SVR models with physically based models significantly improved the performance of both the ANN and SVR models (Tongal and Booij, 2018). Recently, in a study conducted in Australia, ANN-based runoff prediction was found to improve when hydrogeomorphic and biophysical variables are incorporated in the ANN input layer (Asadi et al., 2019). SVR provides better monthly streamflow forecasts for Shigu and Xiangjiaba, China, if the model is coupled with gray correlation analysis and the seasonal-trend decomposition procedure based on loess (Luo et al., 2019). ANN provides superior prediction of 1- to 8-day-ahead streamflow compared with a process-based model (e.g., the Soil and Water Assessment Tool-Variable Source Area model) (Wagena et al., 2020). According to a study conducted in China's Three Gorges Dam, decomposing input streamflow data with the Fourier transform improves the capability of the SVR model to predict monthly streamflow into the dam (Yu et al., 2020).

9.3 Averaging streamflow predicted from multiclimate models in the neural network framework

The basic framework of a neural network consists of the input layer, the hidden layer, and the output layer (Crespo and Mora, 1993). Neural networks comprise computational elements called neurons, which are activated through activation functions. Selection of a particular activation function depends on the intended output of the neural network. Probability problems (e.g., image classification) often use the sigmoid activation function, whose output ranges from 0 to 1 (Sarajedini et al., 1999). However, modeling of streamflow requires the actual values of the streamflow, and hence, the activation function applied here is the rectified linear unit (ReLU) (Hara et al., 2015). The neurons are contained in the hidden layer(s), and the connections between the hidden layers carry the weights. In addition to the weights, the output from a hidden layer is adjusted with a bias term; therefore, in addition to the neurons present in a given hidden layer, a neuron for the bias term is included. It may be noted that a bias term must also be included in the input layer. A neuron in the current hidden layer is connected to all the neurons in the previous hidden layer, and thus the network is called a fully connected neural network. The case study presented ahead uses neural networks with one hidden layer, called single-layered (artificial) neural networks, whereas a neural network model that contains more than one hidden layer is called a deep neural network (DNN) model (Achieng, 2019a; Veres et al., 2015). In the case study presented ahead, streamflow from 10 climate models is averaged in the neural network framework. Therefore, the input layer of the ANNs/DNNs contains streamflow from 10 RCMs, as shown in Fig. 9.1. The output layer contains the averaged streamflow.
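As an illustrative sketch only (the chapter's own workflow is summarized in Fig. 9.2A), such an averaging network can be set up along the following lines in Python with scikit-learn; `X_rcm` (an array of shape [days, 10] holding the 10 RCM streamflow series) and `q_obs` (observed streamflow) are assumed inputs, and the layer width, iteration count, and train/test split mirror the optima reported later in Section 9.5.

```python
# Hedged sketch of the RCM-streamflow averaging network; X_rcm and q_obs are
# hypothetical arrays, and hyperparameters mirror those reported in Section 9.5.
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    X_rcm, q_obs, test_size=0.2, random_state=0)

ann = MLPRegressor(hidden_layer_sizes=(150,),   # one hidden layer -> "ANN"
                   activation="relu",           # ReLU for real-valued streamflow
                   max_iter=100_000)
ann.fit(X_train, y_train)
q_averaged = ann.predict(X_test)                # machine-learning-averaged flow

# A "DNN" in the chapter's sense simply stacks additional hidden layers, e.g.
# MLPRegressor(hidden_layer_sizes=(150, 150), activation="relu").
```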


FIGURE 9.1 The neural network architecture for averaging streamflow from 10 regional climate models.

Computationally, the neural network model uses a gradient-descent algorithm to minimize the mean squared error between the observed and the modeled streamflow (Aqil et al., 2007; Venkata Ramana et al., 2013). The objective function is a function of the weights and the bias terms, and it is often called the cost function (Holt and Semnani, 1990). At every step of the gradient descent, a given weight is updated by subtracting the derivative of the cost function with respect to that weight from its current value; the bias term is updated in a similar manner. At every step, all the weights and bias terms are updated. A threshold error value is set to determine when enough steps have been taken to arrive at the optimum of the cost function. This process of updating the weight and bias terms, taking small steps downhill during the gradient descent until their optimum values are reached, is called training of the neural network (Chow and Cho, 2007; Graupe, 2013). Comprehensive information about neural networks is already well established in the literature (e.g., Tongal and Booij, 2018) and in another chapter of this book, and hence it is only briefly discussed in this section. A step-by-step detail of how to average streamflow from multiple climate models within the neural network framework is provided in Fig. 9.2A.
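The update rule described above can be written compactly; the toy sketch below shows a single gradient-descent step for a mean-squared-error cost with one weight vector and one bias, and is a didactic simplification rather than the multi-layer training loop used in the case study.

```python
# Toy illustration of one gradient-descent update on an MSE cost J for a
# single weight vector w and bias b (didactic simplification only).
import numpy as np

def gradient_step(w, b, X, y, lr=1e-3):
    """J = mean((X @ w + b - y)**2); each step subtracts lr * dJ/dw and lr * dJ/db."""
    residual = X @ w + b - y
    grad_w = 2.0 * X.T @ residual / len(y)
    grad_b = 2.0 * residual.mean()
    return w - lr * grad_w, b - lr * grad_b
```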


FIGURE 9.2 Steps for averaging streamflow from multiple climate models with (A) ANN/DNN and (B) SVR. RBF is radial basis function. C is the regularization parameter (error complexity trade-off of the SVR model). The "gamma" is the SVR kernel coefficient. ε is the precision/accuracy term (SVR threshold error). ANN, artificial neural network; DNN, deep neural network; SVR, support vector regression.

9.4 Averaging streamflow predicted by multiclimate models in the framework of support vector regression

The SVR is a subset of support vector machines (Schölkopf et al., 1997) that deals with solving regression problems. Like the neural network-based streamflow averaging model, the input data fed into the SVR-based streamflow averaging model consist of the climatic projections of the 10 RCMs. Unlike the neural networks, only a subset of the training dataset is used to fit the SVR model; this subset of the training dataset constitutes the support vectors (Smola et al., 2004). An error margin, or error tube, is used to define the error threshold, and the data points whose deviations are greater than or equal to the error margin are used as the support vectors. The support vectors are used to develop the SVR model by minimizing the squared error between the observed streamflow and the averaged streamflow. Unlike the neural networks, the SVR


objective function is constrained in nature. Therefore, Lagrange multipliers are used to combine the objective function and the constraints into a Lagrangean function, which is then solved through a numerical method called quadratic programming (Frank and Wolfe, 1956; Goldfarb and Idnani, 1983; Gould and Toint, 2002). Solving the SVR Lagrangean function is often difficult, especially when the input data are multidimensional, as it may require mapping the data from the current space into a higher-dimensional space before solving the constrained objective function. Derivation of the mapping function is often difficult (Tongal and Booij, 2018), and therefore, kernels are commonly used instead of these mapping functions. In the case study presented ahead, the plausibility of three commonly used SVR kernels for modeling streamflow is investigated: (i) the linear, (ii) the quadratic, and (iii) the RBF kernels. The main SVR model parameter is the trade-off parameter (i.e., the C value), which provides a balance between model complexity and model fit. Besides the C value, an SVR kernel like the RBF has a kernel parameter called gamma, which also needs to be optimized. More information about the SVR models can be found in the literature (e.g., Noori et al., 2015; Schölkopf et al., 1997; Zhou et al., 2017). Implementation steps for averaging climate models' streamflow within the SVR framework are provided in Fig. 9.2B.
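A hedged sketch of the three SVR variants in scikit-learn follows; `X_train`, `y_train`, and `X_test` are the assumed RCM-streamflow matrix and observed-streamflow arrays from the earlier sketch, and the C, gamma, and epsilon values shown are placeholders (Section 9.5 reports C = 1000 and gamma = 0.10 as the optima found in the case study).

```python
# Hedged sketch of the linear, quadratic, and RBF-kernel SVR averaging models;
# C, gamma, and epsilon are placeholder values to be tuned for a real dataset.
from sklearn.svm import SVR

svr_models = {
    "linear":    SVR(kernel="linear", C=1000.0, epsilon=0.1),
    "quadratic": SVR(kernel="poly", degree=2, C=1000.0, epsilon=0.1),
    "rbf":       SVR(kernel="rbf", C=1000.0, gamma=0.10, epsilon=0.1),
}

predictions = {name: model.fit(X_train, y_train).predict(X_test)
               for name, model in svr_models.items()}
```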

9.5 Machine learning-averaged streamflow from multiple climate models: two case studies

In this section, the applicability of machine learning models as a framework for averaging streamflow from multiple climate models is demonstrated through two case studies from the State of Minnesota, United States, where 10 neural network models (one ANN model and nine DNN models) were developed. Besides the 10 neural network models, three widely used SVR models were also developed, based on the (i) RBF, (ii) quadratic, and (iii) linear kernels. These models were trained and tested using observed streamflow datasets (1968–98) from two basins, and the details are discussed ahead as two case studies. The two case studies were performed in the Cedar River basin near Austin and the Rainy River basin at Manitou Rapids; both basins are located in the State of Minnesota of the United States. These basins were selected because of the readily available continuous streamflow data and because of familiarity with both basins. The Cedar Basin, with a drainage area of 1,500 km2, is much smaller than the Rainy Basin, which drains 30,000 km2. Hence, the Rainy Basin produces a mean annual streamflow of 387 m3/s, much larger than the 8 m3/s of mean annual streamflow generated from the Cedar Basin. However, the Cedar Basin received relatively more mean annual rainfall (834 mm) than the Rainy Basin (671 mm) during the 31-year period (1968–98).


The observed streamflow datasets for the 31-year period (1968-98) were obtained from the US Geological Survey (USGS), whereas the climate models' streamflow datasets for the two study basins were obtained from Achieng et al. (2019a). The Cedar River basin's gauging station is situated at [92.97 degrees, 43.64 degrees], whereas the Rainy River basin's gauging station is located at [95.54 degrees, 44.72 degrees]. The climate model datasets are managed by the North American Regional Climate Change Assessment Program (NARCCAP) (Mearns et al., 2012; NARCCAP, 2007). NARCCAP manages the RCM datasets, which are gridded at 50-km spatial resolution with a temporal resolution of 8 h. Daily streamflow data for 31 years (1968-98) were extracted for the 10 RCMs mentioned in Fig. 9.1. NARCCAP RCMs have land surface models that partition precipitation into various hydrological variables such as surface and subsurface runoff. The combination of surface and subsurface runoff was used as streamflow, which is consistent with other studies (Achieng and Zhu, 2019a). To capture all flow regimes, the 0 to 100th percentiles of the daily streamflow dataset were used for the analysis, i.e., for training and testing of the machine learning models. These datasets were split into training and testing proportions in the ratio of 0.8 and 0.2, respectively: the first 29 years (1968-96) of streamflow data were used for model training and the remaining 2 years (1997-98) were used for model testing. The optimal configuration of the neural network models for averaging the RCM-predicted streamflow was found to have 150 neurons in one hidden layer with the ReLU activation function, trained with 100,000 iterations. The optimal value of the error-complexity trade-off parameter (C) for the SVR models was 1000, and the optimal gamma value for the RBF kernel was 0.10. During the training phase, both the ANN and DNN models predicted the averaged streamflow almost perfectly for both the Cedar and Rainy basins, as seen in Figs. 9.3A and 9.4A, respectively. The excellent performance of both the ANN and DNN models in predicting streamflow is further supported by the values of five performance evaluation criteria, i.e., Nash-Sutcliffe efficiency (NSE) > 99%, percent bias (PBIAS) < 1%, coefficient of determination (R2) = 1, root mean square error (RMSE) < 0.02 mm/day, and Willmott degree of agreement (d1) > 0.99, for both the Cedar and Rainy basins, as shown in Tables 9.1 and 9.2, respectively. Similar to the training phase, the trained ANN and DNN models showed good performance during the testing phase. This is evidenced by the close correspondence to the 1:1 line between the modeled and observed streamflows for both the Cedar and Rainy basins, as shown in Figs. 9.3B and 9.4B, respectively. The good performance of the trained neural network models for both the Cedar and Rainy basins is further confirmed by Tables 9.1 and 9.2, respectively. Even though there seems to be no significant difference between the performance of the neural networks and the SVR models, the neural networks seem to marginally outperform the SVR models in both the Cedar and Rainy



FIGURE 9.3 Cedar basin’s streamflow prediction using neural network models.

FIGURE 9.4 Rainy basin’s streamflow prediction using neural network models.

basins, as shown in Tables 9.1 and 9.2 and in Figs. 9.5 and 9.6, respectively. The difference between the two model types is most conspicuous in the testing phase. In both basins, the tested neural networks performed with NSE > 95%, PBIAS < |5%|, R2 > 0.96, RMSE < 0.16 mm/day, and d1 > 0.95, whereas the tested SVR models showed relatively lower performance with NSE > 82%, PBIAS < |9.9%|, R2 > 0.84, RMSE < 0.18 mm/day, and d1 > 0.80. The quadratic function-based SVR model struggled to fit the model that averages the RCM streamflow under the SVR framework.


TABLE 9.1 Performance of the neural networks and the support vector regression (SVR) models used to model streamflow in Cedar Basin.

Phase    | Models          | Model type | NSE (%) | PBIAS (%) | R2    | RMSE (mm/day) | d1
Training | Neural networks | ANN        | 99.996  | 0.003     | 1.000 | 0.013         | 0.997
Training | Neural networks | DNN2       | 99.996  | 0.007     | 1.000 | 0.013         | 0.997
Training | Neural networks | DNN3       | 99.997  | 0.226     | 1.000 | 0.012         | 0.997
Training | Neural networks | DNN4       | 99.996  | 0.026     | 1.000 | 0.012         | 0.997
Training | Neural networks | DNN5       | 99.996  | 0.089     | 1.000 | 0.013         | 0.997
Training | Neural networks | DNN6       | 99.996  | 0.422     | 1.000 | 0.012         | 0.996
Training | Neural networks | DNN7       | 99.996  | 0.000     | 1.000 | 0.013         | 0.997
Training | Neural networks | DNN8       | 99.996  | 0.026     | 1.000 | 0.013         | 0.997
Training | Neural networks | DNN9       | 99.996  | 0.361     | 1.000 | 0.013         | 0.996
Training | Neural networks | DNN10      | 99.991  | 0.286     | 1.000 | 0.019         | 0.996
Training | SVR             | RBF        | 99.690  | 4.894     | 0.997 | 0.112         | 0.945
Training | SVR             | Quadratic  | 99.459  | 6.930     | 0.996 | 0.148         | 0.913
Training | SVR             | Linear     | 99.676  | 6.504     | 0.997 | 0.115         | 0.943
Testing  | Neural networks | ANN        | 99.923  | 0.251     | 0.999 | 0.022         | 0.991
Testing  | Neural networks | DNN2       | 99.917  | 0.057     | 0.999 | 0.023         | 0.992
Testing  | Neural networks | DNN3       | 99.951  | 0.571     | 1.000 | 0.017         | 0.993
Testing  | Neural networks | DNN4       | 99.870  | 0.395     | 0.999 | 0.028         | 0.990
Testing  | Neural networks | DNN5       | 99.937  | 0.681     | 1.000 | 0.020         | 0.993
Testing  | Neural networks | DNN6       | 99.716  | 3.490     | 1.000 | 0.042         | 0.983
Testing  | Neural networks | DNN7       | 96.993  | 4.481     | 0.988 | 0.136         | 0.963
Testing  | Neural networks | DNN8       | 99.945  | 0.236     | 1.000 | 0.018         | 0.993
Testing  | Neural networks | DNN9       | 99.550  | 3.103     | 0.998 | 0.052         | 0.983
Testing  | Neural networks | DNN10      | 95.729  | 3.784     | 0.964 | 0.162         | 0.954
Testing  | SVR             | RBF        | 98.360  | 1.977     | 0.985 | 0.100         | 0.942
Testing  | SVR             | Quadratic  | 96.325  | 9.895     | 0.988 | 0.150         | 0.888
Testing  | SVR             | Linear     | 98.187  | 6.549     | 0.990 | 0.105         | 0.935



TABLE 9.2 Performance of the neural networks and the support vector regression (SVR) models used to model streamflow in Rainy Basin.

Phase    | Models          | Model type | NSE (%) | PBIAS (%) | R2    | RMSE (mm/day) | d1
Training | Neural networks | ANN        | 99.997  | 0.000     | 1.000 | 0.002         | 0.998
Training | Neural networks | DNN2       | 99.996  | 0.022     | 1.000 | 0.003         | 0.998
Training | Neural networks | DNN3       | 99.973  | 0.548     | 1.000 | 0.008         | 0.993
Training | Neural networks | DNN4       | 99.997  | 0.090     | 1.000 | 0.003         | 0.998
Training | Neural networks | DNN5       | 99.998  | 0.000     | 1.000 | 0.002         | 0.999
Training | Neural networks | DNN6       | 99.999  | 0.007     | 1.000 | 0.002         | 0.999
Training | Neural networks | DNN7       | 99.988  | 0.448     | 1.000 | 0.005         | 0.995
Training | Neural networks | DNN8       | 99.998  | 0.129     | 1.000 | 0.002         | 0.999
Training | Neural networks | DNN9       | 99.996  | 0.266     | 1.000 | 0.003         | 0.997
Training | Neural networks | DNN10      | 99.996  | 0.126     | 1.000 | 0.003         | 0.998
Training | SVR             | RBF        | 99.033  | 0.587     | 0.991 | 0.047         | 0.942
Training | SVR             | Quadratic  | 80.041  | 6.918     | 0.816 | 0.215         | 0.760
Training | SVR             | Linear     | 99.219  | 0.749     | 0.993 | 0.042         | 0.946
Testing  | Neural networks | ANN        | 99.993  | 0.159     | 1.000 | 0.004         | 0.997
Testing  | Neural networks | DNN2       | 99.993  | 0.259     | 1.000 | 0.004         | 0.996
Testing  | Neural networks | DNN3       | 99.964  | 0.903     | 1.000 | 0.009         | 0.990
Testing  | Neural networks | DNN4       | 99.983  | 0.136     | 1.000 | 0.006         | 0.994
Testing  | Neural networks | DNN5       | 99.992  | 0.058     | 1.000 | 0.004         | 0.996
Testing  | Neural networks | DNN6       | 99.984  | 0.271     | 1.000 | 0.006         | 0.994
Testing  | Neural networks | DNN7       | 99.972  | 0.568     | 1.000 | 0.007         | 0.991
Testing  | Neural networks | DNN8       | 99.979  | 0.331     | 1.000 | 0.007         | 0.994
Testing  | Neural networks | DNN9       | 99.950  | 0.700     | 1.000 | 0.010         | 0.992
Testing  | Neural networks | DNN10      | 99.968  | 0.092     | 1.000 | 0.008         | 0.993
Testing  | SVR             | RBF        | 98.934  | 1.789     | 0.990 | 0.046         | 0.947
Testing  | SVR             | Quadratic  | 82.763  | 1.672     | 0.842 | 0.187         | 0.786
Testing  | SVR             | Linear     | 99.035  | 0.946     | 0.991 | 0.044         | 0.949


FIGURE 9.5 Cedar basin’s streamflow prediction using support vector regression models.

FIGURE 9.6 Rainy basin’s streamflow prediction using support vector regression models.

As a result, the quadratic function-based SVR model had the highest bias in the testing phase, with a PBIAS value of 9.9% in the Cedar Basin and +1.7% in the Rainy Basin. The RBF-based SVR models outperformed both the linear and quadratic kernel-based SVR models in modeling streamflow. This finding is consistent with other hydrological studies (Achieng, 2019a,b; Lamorski et al., 2017).
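As a companion to Tables 9.1 and 9.2, the minimal sketch below shows one way the reported criteria (NSE, PBIAS, RMSE, and a Willmott-type d1 agreement index) could be computed for a pair of observed and averaged streamflow series; the formulas follow common hydrological usage and may differ in detail from those used by the chapter's authors, and the function name is illustrative.

```python
import numpy as np

def evaluate(obs, sim):
    """Return NSE (%), PBIAS (%), RMSE, and Willmott's modified agreement index d1."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    err = sim - obs
    nse = 100.0 * (1.0 - np.sum(err**2) / np.sum((obs - obs.mean())**2))
    pbias = 100.0 * np.sum(err) / np.sum(obs)
    rmse = np.sqrt(np.mean(err**2))
    d1 = 1.0 - np.sum(np.abs(err)) / np.sum(
        np.abs(sim - obs.mean()) + np.abs(obs - obs.mean()))
    return nse, pbias, rmse, d1
```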

9.6 Conclusions

The neural networks and SVR models offer viable frameworks for averaging streamflow predicted by multiple climate models.



Even though there is no significant difference between the performance of the neural network- and SVR model-based averaging frameworks, the neural networks marginally outperformed the SVR in averaging streamflow from multiple climate models. It is worth mentioning that special care should be taken in training the neural networks, as gradient descent tends to get stuck in local minima. It is suggested that the two machine learning models (neural networks and SVR) provide viable frameworks for averaging streamflow from multiple climate models. Neither the neural networks nor the SVR models require knowledge of the physical processes governing the hydrological variables, which is required by process-based models such as rainfall-runoff models. Moreover, the machine learning models can average highly nonlinear, climate-modeled streamflow from multiple climate models. The SVR models do not get stuck in local minima because their optimization is a structural risk minimization of an ε-insensitive loss function rather than gradient descent. Besides, the SVR models only use support vectors (a subset of the training dataset) to fit the model. Because SVR uses the support vectors and ignores the nonsupport vectors, the SVR models not only fit much faster than the neural networks but are also able to handle noise in the training dataset. The SVR models use a C value (the trade-off between model complexity and model fit), which allows the SVR models to handle outliers robustly and thus helps prevent overfitting during training.

References Achieng, K.O., 2019a. Modelling of soil moisture retention curve using machine learning techniques: artificial and deep neural networks vs support vector regression models. Comput. Geosci. 104320 https://doi.org/10.1016/J.CAGEO.2019.104320. Achieng, K.O., 2019b. Evaluating pump performance using laboratory observations and machine learning. ISH J. Hydraul. Eng. https://doi.org/10.1080/09715010.2019.1608596. Achieng, K.O., Zhu, J., 2019a. Application of Bayesian framework for evaluation of streamflow simulations using multiple climate models. J. Hydrol. 574, 1110e1128. https://doi.org/ 10.1016/j.jhydrol.2019.05.018. Achieng, K.O., Zhu, J., 2019b. Modelling groundwater recharge with multiple climate models in machine learning frameworks. In: CUAHSI Conference on Hydroinformatics July 29 - 31, 2019. Brigham Young University, Provo, Utah. Adamowski, J., Prasher, S.O., 2012. Comparison of machine learning methods for runoff forecasting in mountainous watersheds with limited data. J. Water Land Dev. 17, 89e97. https:// doi.org/10.2478/v10025-012-0012-1. Adnan, R.M., Yuan, X., Kisi, O., Adnan, M., Mehmood, A., 2018. Stream flow forecasting of poorly gauged mountainous watershed by least square support vector machine, fuzzy genetic algorithm and M5 model tree using climatic data from nearby station. Water Resour. Manag. 32, 4469e4486. https://doi.org/10.1007/s11269-018-2033-2. Adnan, R.M., Yuan, X., Kisi, O., Yuan, Y., 2017. Streamflow forecasting using artificial neural network and support vector machine models. Am. Sci. Res. J. Eng. Technol. Sci. 29, 286e294.

256 Advances in Streamflow Forecasting Agarwal, A., Singh, R.D., 2004. Runoff modelling through back propagation artificial neural network with variable rainfall-runoff data. Water Resour. Manag. 18, 285e300. https://doi.org/ 10.1023/B:WARM.0000043134.76163.b9. Antar, M.A., Elassiouti, I., Allam, M.N., 2006. Rainfall-runoff modelling using artificial neural networks technique: a Blue Nile catchment case study. Hydrol. Process. 20, 1201e1216. https://doi.org/10.1002/hyp.5932. Aqil, M., Kita, I., Yano, A., Nishiyama, S., 2007. Neural networks for real time catchment flow modeling and prediction. Water Resour. Manag. 21, 1781e1796. https://doi.org/10.1007/ s11269-006-9127-y. Asadi, H., Shahedi, K., Jarihani, B., Sidle, R., 2019. Rainfall-runoff modelling using hydrological connectivity index and artificial neural network approach. Water 11, 212. https://doi.org/ 10.3390/w11020212. Asefa, T., Kemblowski, M., McKee, M., Khalil, A., 2006. Multi-time scale stream flow predictions: the support vector machines approach. J. Hydrol. 318, 7e16. https://doi.org/10.1016/ J.JHYDROL.2005.06.001. Bae, H., Ji, H., Lim, Y.J., Ryu, Y., Kim, M.H., Kim, B.J., 2019. Characteristics of drought propagation in South Korea: relationship between meteorological, agricultural, and hydrological droughts. Nat. Hazards 99, 1e16. https://doi.org/10.1007/s11069-019-03676-3. Behzad, M., Asghari, K., Eazi, M., Palhang, M., 2009. Generalization performance of support vector machines and neural networks in runoff modeling. Expert Syst. Appl. 36, 7624e7629. https://doi.org/10.1016/j.eswa.2008.09.053. Belayneh, A., Adamowski, J., 2013. Drought forecasting using new machine learning methods. J. Water Land Dev. 18, 3e12. https://doi.org/10.2478/jwld-2013-0001. Campolo, M., Soldati, A., Andreussi, P., 2003. Artificial neural network approach to flood forecasting in the River Arno. Hydrol. Sci. J. 48, 381e398. https://doi.org/10.1623/hysj.48.3. 381.45286. Castellano-Me´ndez, M., Gonza´lez-Manteiga, W., Febrero-Bande, M., Manuel Prada-Sa´nchez, J., Lozano-Caldero´n, R., 2004. Modelling of the monthly and daily behaviour of the runoff of the Xallas river using Box-Jenkins and neural networks methods. J. Hydrol. 296, 38e58. https:// doi.org/10.1016/j.jhydrol.2004.03.011. Chan, W.C.H., Thompson, J.R., Taylor, R.G., Nay, A.E., Ayenew, T., MacDonald, A.M., Todd, M.C., 2020. Uncertainty assessment in river flow projections for Ethiopia’s Upper Awash Basin using multiple GCMs and hydrological models. Hydrol. Sci. J. https://doi.org/ 10.1080/02626667.2020.1767782. Chen, S.M., Wang, Y.M., Tsou, I., 2013. Using artificial neural network approach for modelling rainfall-runoff due to typhoon. J. Earth Syst. Sci. 122, 399e405. https://doi.org/10.1007/ s12040-013-0289-8. Chen, S.T., Yu, P.S., 2007. Real-time probabilistic forecasting of flood stages. J. Hydrol. 340, 63e77. https://doi.org/10.1016/j.jhydrol.2007.04.008. Chien, H., J-F Yeh, P., Knouft, J.H., 2013. Modeling the potential impacts of climate change on streamflow in agricultural watersheds of the Midwestern United States. J. Hydrol. 491, 73e88. https://doi.org/10.1016/j.jhydrol.2013.03.026. Chow, T.W.S., Cho, S.-Y., 2007. Neural Networks and Computing, Series in Electrical and Computer Engineering. Imperial College Press. https://doi.org/10.1142/p487. Cigizoglu, H.K., Alp, M., 2004. Rainfall-runoff modelling using three neural network methods. In: Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science). Springer Verlag, pp. 166e171. 
https://doi.org/10.1007/978-3-540-24844-6_20.

Averaging multiclimate model Chapter | 9

257

Clair, T.A., Ehrman, J.M., 1996. Variations in discharge and dissolved organic carbon and nitrogen export from terrestrial basins with changes in climate: a neural network approach. Limnol. Oceanogr. 41, 921e927. https://doi.org/10.4319/lo.1996.41.5.0921. Crespo, J.L., Mora, E., 1993. Drought estimation with neural networks. Adv. Eng. Softw. 18, 167e170. https://doi.org/10.1016/0965-9978(93)90064-Z. Crosbie, R.S., Dawes, W.R., Charles, S.P., Mpelasoka, F.S., Aryal, S., Barron, O., Summerell, G.K., 2011. Differences in future recharge estimates due to GCMs, downscaling methods and hydrological models. Geophys. Res. Lett. 38, 1e5. https://doi.org/10.1029/ 2011GL047657. Daliakopoulos, I.N., Tsanis, I.K., 2016. Comparison of an artificial neural network and a conceptual rainfallerunoff model in the simulation of ephemeral streamflow. Hydrol. Sci. J. 61, 2763e2774. https://doi.org/10.1080/02626667.2016.1154151. Dawson, C.W., Harpham, C., Wilby, R.L., Chen, Y., 2016. Evaluation of Artificial Neural Network Techniques for Flow Forecasting in the River Yangtze, China. Hydrol. Earth Syst. Sci. 6, 619e626. https://doi.org/10.5194/hess-6-619-2002. Dawson, C.W., Wilby, R.L., 2001. Hydrological modelling using artificial neural networks. Prog. Phys. Geogr. Earth Environ. 25, 80e108. https://doi.org/10.1177/030913330102500104. Deo, R.C., Sahin, M., 2016. An extreme learning machine model for the simulation of monthly mean streamflow water level in eastern Queensland. Environ. Monit. Assess. 188, 1e24. https://doi.org/10.1007/s10661-016-5094-9. Dibike, Y.B., Velickov, S., Solomatine, D., Abbott, M.B., 2001. Model induction with support vector machines: introduction and applications. J. Comput. Civ. Eng. 15, 208e216. https:// doi.org/10.1061/(ASCE)0887-3801(2001)15:3(208). Elganiny, M.A., Eldwer, A.E., 2018. Enhancing the forecasting of monthly streamflow in the main key stations of the River Nile Basin. Water Resour. 45, 660e671. https://doi.org/10.1134/ S0097807818050135. Frank, M., Wolfe, P., 1956. An algorithm for quadratic programming. Nav. Res. Logist. Q. 3, 95e110. https://doi.org/10.1002/nav.3800030109. Ganguli, P., Janga Reddy, M., 2014. Ensemble prediction of regional droughts using climate inputs and the SVM-copula approach. Hydrol. Process. 28, 4989e5009. https://doi.org/10.1002/ hyp.9966. Ghumman, A.R., Ghazaw, Y.M., Sohail, A.R., Watanabe, K., 2011. Runoff forecasting by artificial neural network and conventional model. Alexandria Eng. J. 50, 345e350. https://doi.org/ 10.1016/j.aej.2012.01.005. Gizaw, M.S., Gan, T.Y., 2016. Regional flood frequency analysis using support vector regression under historical and future climate. J. Hydrol. 538, 387e398. https://doi.org/10.1016/ j.jhydrol.2016.04.041. Goldfarb, D., Idnani, A., 1983. A numerically stable dual method for solving strictly convex quadratic programs. Math. Program. 27, 1e33. https://doi.org/10.1007/BF02591962. Gould, N.I.M., Toint, P.L., 2002. Numerical Methods for Large-Scale Non-convex Quadratic Programming. In: Trends in Industrial and Applied Mathematics. Springer, Boston, MA, pp. 149e179. https://doi.org/10.1007/978-1-4613-0263-6_8. Graupe, D., 2013. Principles of Artificial Neural Networks, Advanced Series in Circuits and Systems. World Scientific. https://doi.org/10.1142/8868. 
Gutowski, W.J., Arritt, R.W., Kawazoe, S., Flory, D.M., Takle, E.S., Biner, S.S., Caya, D., Jones, R.G., Laprise, R.R., Leung, L.R., Mearns, L.O., Moufouma-Okia, W., Nunes, A.M.B., Qian, Y., Roads, J.O., Sloan, L.C., Snyder, M.a., Raymond, A., L Ruby, L., Gutowski Jr., W.J., 2010. Regional extreme monthly precipitation simulated by NARCCAP RCMs. J. Hydrometeorol. 11, 1373e1379. https://doi.org/10.1175/2010JHM1297.1.

258 Advances in Streamflow Forecasting Han, D., Chan, L., Zhu, N., 2007. Flood forecasting using support vector machines. J. Hydroinf. 09, 267e276. Hara, K., Saito, D., Shouno, H., 2015. Analysis of function of rectified linear unit used in deep learning. In: Proceedings of the International Joint Conference on Neural Networks. Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/IJCNN.2015.7280578. Holt, M.J.J., Semnani, S., 1990. Convergence of back-propagation in neural networks using a loglikelihood cost function. Electron. Lett. 26, 1964e1965. https://doi.org/10.1049/el:19901270. Honorato, A.G. da S.M., Silva, G.B.L. da, Guimara˜es Santos, C.A., 2018. Monthly streamflow forecasting using neuro-wavelet techniques and input analysis. Hydrol. Sci. J. 63, 2060e2075. https://doi.org/10.1080/02626667.2018.1552788. Hsu, K., Gupta, H.V., Sorooshian, S., 1995. Artificial neural network modeling of the rainfallrunoff process. Water Resour. Res. 31, 2517e2530. https://doi.org/10.1029/95WR01955. Huang, S., Chang, J., Huang, Q., Chen, Y., 2014. Monthly streamflow prediction using modified EMD-based support vector machine. J. Hydrol. 511, 764e775. https://doi.org/10.1016/ j.jhydrol.2014.01.062. Jain, S.K., 2012. Modeling river stage-discharge-sediment rating relation using support vector regression. Hydrol. Res. 43, 851e861. https://doi.org/10.2166/nh.2011.101. Jayawardena, A.W., Fernando, T.M.K.G., 2001. River flow prediction: an artificial neural network approach A. W. In: Regional Management of Water Resources (Proceedings of a Symposium Held during Die Sixth IAHS Scientific Assembly at Maastricht, The Netherlands, July 2001). p. IAHS Publ. no. 268. Jian, Y.L., Chun, T.C., Kwok, W.C., 2006. Using support vector machines for long term discharge prediction. Hydrol. Sci. J. 51, 599e612. https://doi.org/10.1623/hysj.51.4.599. Jiang, Y., Palash, W., Akanda, A.S., Small, D.L., Islam, S., 2016. A simple streamflow forecasting scheme for the ganges basin. In: Flood Forecasting: A Global Perspective. Elsevier Inc., pp. 399e420. https://doi.org/10.1016/B978-0-12-801884-2.00015-3 Kalteh, A.M., 2013. Monthly river flow forecasting using artificial neural network and support vector regression models coupled with wavelet transform. Comput. Geosci. 54, 1e8. https:// doi.org/10.1016/j.cageo.2012.11.015. Kang, K.W., Kim, J.H., Park, C.Y., Ham, K.J., 1993. Evaluation of hydrological forecasting system based on neural network model. In: Proceedings of the 25th Congress of the International Association for Hydraulic Research. Delft, The Netherlands, pp. 257e264. Karran, D.J., Morin, E., Adamowski, J., 2014. Multi-step Streamflow Forecasting Using DataDriven Non-linear Methods in Contrasting Climate Regimes. J. Hydroinf. 16, 671e689. https://doi.org/10.2166/hydro.2013.042. Karunanithi, N., Grenney, W.J., Whitley, D., Bovee, K., 1994. Neural networks for river flow prediction. J. Comput. Civ. Eng. 8, 201e220. https://doi.org/10.1061/(ASCE)08873801(1994)8:2(201). ¨ ., 2007. Streamflow forecasting using different artificial neural network algorithms. Kis¸i, O J. Hydrol. Eng. 12, 532e539. https://doi.org/10.1061/(ASCE)1084-0699(2007)12:5(532). Kisi, O., Cigizoglu, H.K., 2007. Comparison of different ANN techniques in river flow prediction. Civ. Eng. Environ. Syst. 24, 211e231. https://doi.org/10.1080/10286600600888565. Kisi, O., Cimen, M., 2011. A wavelet-support vector machine conjunction model for monthly streamflow forecasting. J. Hydrol. 399, 132e140. 
https://doi.org/10.1016/ j.jhydrol.2010.12.041. Krajewski, W.F., Ceynar, D., Demir, I., Goska, R., Kruger, A., Langel, C., Mantilllla, R., Niemeier, J., Quintero, F., Seo, B.C., Smallll, S.J., Weber, L.J., Young, N.C., 2017. Real-time flood forecasting and information system for the state of Iowa. Bull. Am. Meteorol. Soc. 98, 539e554. https://doi.org/10.1175/BAMS-D-15-00243.1.

Averaging multiclimate model Chapter | 9

259

 unek, J., Sławinski, C., Lamorska, J., 2017. An estimation of the main Lamorski, K., Sim wetting branch of the soil water retention curve based on its main drying branch using the machine learning method. Water Resour. Res. 53, 1539e1552. https://doi.org/10.1002/ 2016WR019533. Liong, S.Y., Sivapragasam, C., 2002. Flood stage forecasting with support vector machines. J. Am. Water Resour. Assoc. 38, 173e186. https://doi.org/10.1111/j.1752-1688.2002.tb01544.x. Loke, E., Warnaars, E.A., Jacobsen, P., Nelen, F., Do Ce´u Almeida, M., 1997. Artificial neural networks as a tool in urban storm drainage. In: Water Science and Technology. Elsevier Science Ltd, pp. 101e109. https://doi.org/10.1016/S0273-1223(97)00612-4. Londhe, S., Charhate, S., 2010. Comparaison de techniques de mode´lisation conditionne´e par les donne´es pour la pre´vision des de´bits fluviaux. Hydrol. Sci. J. 55, 1163e1174. https://doi.org/ 10.1080/02626667.2010.512867. Londhe, S., Gavraskar, S.S., 2015. Forecasting one day ahead stream flow using support vector regression. Aquat. Proced. 4, 900e907. https://doi.org/10.1016/j.aqpro.2015.02.113. Lorrai, M., Sechi, G.M., 1995. Neural nets for modelling rainfall-runoff transformations. Water Resour. Manag. 9, 299e313. https://doi.org/10.1007/BF00872489. Luo, X., Yuan, X., Zhu, S., Xu, Z., Meng, L., Peng, J., 2019. A hybrid support vector regression framework for streamflow forecast. J. Hydrol. 568, 184e193. https://doi.org/10.1016/ j.jhydrol.2018.10.064. Machado, F., Mine, M., Kaviski, E., Fill, H., 2011. Monthly rainfallerunoff modelling using artificial neural networks. Hydrol. Sci. J. 56, 349e361. https://doi.org/10.1080/ 02626667.2011.559949. Maity, R., Bhagwat, P.P., Bhatnagar, A., 2010. Potential of support vector regression for prediction of monthly streamflow using endogenous property. Hydrol. Process. 24, 917e923. https:// doi.org/10.1002/hyp.7535. McKee, T.B., Doesken, N.J., Kleist, J., 1993. The relationship of drought frequency and duration of time scales, In: Eighth Conference on Applied Climatology. American Meteorological Society. 17, 179e183. Mearns, L.O., Arritt, R., Biner, S., Bukovsky, M.S., McGinnis, S., Sain, S., Caya, D., Correia Jr., J., Flory, D., Gutowski, W., Takle, E.S., Jones, R., Leung, R., Moufouma-Okia, W., McDaniel, L., Nunes, A.M.B., Qian, Y., Roads, J., Sloan, L., Snyder, M., Correia, J., Flory, D., Gutowski, W., Takle, E.S., Jones, R., Leung, R., Moufouma-Okia, W., McDaniel, L., Nunes, A.M.B., Qian, Y., Roads, J., Sloan, L., Snyder, M., 2012. The North American regional climate change assessment program dataset. Bull. Am. Meteorol. Soc. 93, 1337e1362. https://doi.org/ 10.1175/BAMS-D-11-00223.1. Misra, D., Oommen, T., Agarwal, A., Mishra, S.K., Thompson, A.M., 2009. Application and analysis of support vector machine based simulation for runoff and sediment yield. Biosyst. Eng. 103, 527e535. https://doi.org/10.1016/j.biosystemseng.2009.04.017. NARCCAP, 2007. North American Regional Climate Change Assessment Program (NARCCAP): Data Tables [WWW Document]. URL. http://www.narccap.ucar.edu/data/data-tables.html. (Accessed 4 December 2019). Nilsson, P., Uvo, C.B., Berndtsson, R., 2006. Monthly runoff simulation: comparing and combining conceptual and neural network models. J. Hydrol. 321, 344e363. https://doi.org/ 10.1016/j.jhydrol.2005.08.007. Noori, R., Deng, Z., Asce, M., Amin, K., Kachoosangi, F.T., 2015. How Reliable Are ANN, ANFIS, and SVM Techniques for Predicting Longitudinal Dispersion Coefficient in Natural Rivers? J. Hydraul. Eng. 
ASCE 142, 04015039. https://doi.org/10.1061/(ASCE)HY.19437900.0001062.

260 Advances in Streamflow Forecasting Partal, T., 2009. River flow forecasting using different artificial neural network algorithms and wavelet transform. Can. J. Civ. Eng. 36, 26e38. https://doi.org/10.1139/L08-090. Peng, T., Zhou, J., Zhang, C., Fu, W., 2017. Streamflow forecasting using empirical wavelet transform and artificial neural networks. Water 9, 406. https://doi.org/10.3390/w9060406. Prakash, O., Sudheer, K.P., Srinivasan, K., 2014. Improved higher lead time river flow forecasts using sequential neural network with error updating. J. Hydrol. Hydromech. 62, 60e74. https://doi.org/10.2478/johh-2014-0010. Raman, H., Sunilkumar, N., 1995. Multivariate modelling of water resources time series using artificial neural networks. Hydrol. Sci. J. 40, 145e163. https://doi.org/10.1080/ 02626669509491401. Rezaei, M., Motlaq, A.A.A., Mahmouei, A.R., Mousavi, S.H., 2015. River flow forecasting using artificial neural network (Shoor Ghaen). Cieˆncia e Nat 37, 207. https://doi.org/10.5902/ 2179460x20849. Riad, S., Mania, J., Bouchaou, L., Najjar, Y., 2004. Rainfall-runoff model usingan artificial neural network approach. Math. Comput. Model. 40, 839e846. https://doi.org/10.1016/ j.mcm.2004.10.012. Sarajedini, A., Hecht-Nielsen, R., Chau, P.M., 1999. Conditional probability density function estimation with sigmoidal neural networks. IEEE Trans. Neural Network. 10, 231e238. https://doi.org/10.1109/72.750544. Scho¨lkopf, B., Sung, K.-K., Burges, C., Girosi, F., Niyogi, P., Poggio, T., Vapnik, V., 1997. Comparing support vector machines with Gaussian kernels to radial basis function classifiers. IEEE Trans. Signal Process. 45, 1e8. https://doi.org/10.1109/78.650102. Senthil Kumar, A.R., Sudheer, K.P., Jain, S.K., Agarwal, P.K., 2005. Rainfall-runoff modelling using artificial neural networks: comparison of network types. Hydrol. Process. 19, 1277e1291. https://doi.org/10.1002/hyp.5581. Shafaei, M., Kisi, O., 2017. Predicting river daily flow using wavelet-artificial neural networks based on regression analyses in comparison with artificial neural networks and support vector machine models. Neural Comput. Appl. 28, 15e28. https://doi.org/10.1007/ s00521-016-2293-9. Shahbazi, A.N., Zahrair, B., Mohsen, N., 2011. Seasonal meteorological drought prediction using support vector machine. World Appl. Sci. J. 13, 1387e1397. https://idosi.org/wasj/wasj13(6)/ 16.pdf. Shamseldin, A.Y., 2010. Artificial neural network model for river flow forecasting in a developing country. J. Hydroinf. 12, 22e35. https://doi.org/10.2166/hydro.2010.027. She, N., Basketfield, D., 2005. Long range forecast of streamflow using support vector machine. In: Impacts of Global Climate Change. American Society of Civil Engineers, Reston, VA, pp. 1e9. https://doi.org/10.1061/40792(173)481. Shortridge, J.E., Guikema, S.D., Zaitchik, B.F., 2016. Machine learning methods for empirical streamflow simulation: a comparison of model accuracy, interpretability, and uncertainty in seasonal watersheds. Hydrol. Earth Syst. Sci. 20, 2611e2628. https://doi.org/10.5194/hess-202611-2016. Shrestha, R.R., Dibike, Y.B., Prowse, T.D., 2012. Modelling of climate-induced hydrologic changes in the Lake Winnipeg watershed. J. Great Lake. Res. 38, 83e94. https://doi.org/ 10.1016/j.jglr.2011.02.004. Singh, S., Jain, S., Ba´rdossy, A., 2014. Training of artificial neural networks using information-rich data. Hydrology 1, 40e62. https://doi.org/10.3390/hydrology1010040. Sivapragasam, C., Liong, S.Y., Pasha, M.F.K., 2001. 
Rainfall and runoff forecasting with SSASVM approach. J. Hydroinf. 3, 141e152. https://doi.org/10.2166/hydro.2001.0014.

Averaging multiclimate model Chapter | 9

261

Smola, A.J., Sch€olkopf, B., Sch€olkopf, S., 2004. A tutorial on support vector regression. Stat. Comput. 14 (3), 199e222. https://doi.org/10.1023/B:STCO.0000035301.49549.88. Sohail, A., Watanabe, K., Takeuchi, S., 2008. Runoff analysis for a small watershed of tono area Japan by back propagation artificial neural network with seasonal data. Water Resour. Manag. 22, 1e22. https://doi.org/10.1007/s11269-006-9141-0. Takle, E.S., Manoj, J.H.A., Lu, E.R., Arritt, R.W., Gutowski, W.J., 2010. Streamflow in the upper Mississippi river basin as simulated by SWAT driven by 20th Century contemporary results of global climate models and NARCCAP regional climate models. Meteorol. Z. 19, 341e346. https://doi.org/10.1127/0941-2948/2010/0464. Thirumalaiah, K., Deo, M.C., 1998. Real-time flood forecasting using neural networks. Comput. Civ. Infrastruct. Eng. 13, 101e111. https://doi.org/10.1111/0885-9507.00090. Tokar, A.S., Markus, M., 2000. Precipitation-runoff modeling using artificial neural networks and conceptual models. J. Hydrol. Eng. ASCE 5, 156e161. https://doi.org/10.1061/(ASCE)10840699(2000)5:2(156). Tongal, H., Booij, M.J., 2018. Simulation and forecasting of streamflows using machine learning models coupled with base flow separation. J. Hydrol. 564, 266e282. https://doi.org/10.1016/ J.JHYDROL.2018.07.004. Toth, E., Brath, A., 2002. Flood forecasting using artificial neural networks in black-box and conceptual rainfall-runoff modelling. Int. Congr. Environ. Model. Softw. 106. Van Loon, A.F., 2015. Hydrological drought explained. Wiley Interdiscip. Rev. Water 2, 359e392. https://doi.org/10.1002/wat2.1085. Veintimilla-Reyes, J., Cisneros, F., Vanegas, P., 2016. Artificial neural networks applied to flow prediction: a use case for the Tomebamba river. In: Procedia Engineering. Elsevier Ltd, pp. 153e161. https://doi.org/10.1016/j.proeng.2016.11.031. Venkata Ramana, R., Krishna, B., Kumar, S.R., Pandey, N.G., 2013. Monthly rainfall prediction using wavelet neural network analysis. Water Resour. Manag. 27, 3697e3711. https://doi.org/ 10.1007/s11269-013-0374-4. Veres, M., Lacey, G., Taylor, G.W., 2015. Deep learning architectures for soil property prediction. In: 2015 12th Conference on Computer and Robot Vision. IEEE, pp. 8e15. https://doi.org/ 10.1109/CRV.2015.15. Wagena, M.B., Goering, D., Collick, A.S., Bock, E., Fuka, D.R., Buda, A., Easton, Z.M., 2020. Comparison of short-term streamflow forecasting using stochastic time series, neural networks, process-based, and Bayesian models. Environ. Model. Softw. 126, 104669. https:// doi.org/10.1016/j.envsoft.2020.104669. Wang, W.C., Chau, K.W., Cheng, C.T., Qiu, L., 2009. A comparison of performance of several artificial intelligence methods for forecasting monthly discharge time series. J. Hydrol. 374, 294e306. https://doi.org/10.1016/j.jhydrol.2009.06.019. Young, C.C., Liu, W.C., Wu, M.C., 2017. A physically based and machine learning hybrid approach for accurate rainfall-runoff modeling during extreme typhoon events. Appl. Soft Comput. J. 53, 205e216. https://doi.org/10.1016/j.asoc.2016.12.052. Yu, P.S., Chen, S.-T., Chang, I.-F., 2008. Real-time flood stage forecasting using support vector regression. In: Practical Hydroinformatics. Springer Berlin Heidelberg, pp. 359e373. https:// doi.org/10.1007/978-3-540-79881-1_26. Yu, P.S., Chen, S.T., Chang, I.F., 2006. Support vector regression for real-time flood stage forecasting. J. Hydrol. 328, 704e716. https://doi.org/10.1016/j.jhydrol.2006.01.021.

262 Advances in Streamflow Forecasting Yu, X., Liong, S.Y., Babovic, V., 2004. EC-SVM approach for real-time hydrologic forecasting. J. Hydroinf. 6, 209e223. https://doi.org/10.2166/hydro.2004.0016. Yu, X., Wang, Y., Wu, L., Chen, G., Wang, L., Qin, H., 2020. Comparison of support vector regression and extreme gradient boosting for decomposition-based data-driven 10-day streamflow forecasting. J. Hydrol. 582, 124293. https://doi.org/10.1016/j.jhydrol.2019.124293. Zadeh, M.R., Amin, S., Khalili, D., Singh, V.P., 2010. Daily outflow prediction by multi layer perceptron with logistic sigmoid and tangent sigmoid activation functions. Water Resour. Manag. 24, 2673e2688. https://doi.org/10.1007/s11269-009-9573-4. Zakaria, Z.A., Shabri, A., 2012. Streamflow forecasting at ungaged sites using support vector machines. Appl. Math. Sci. 6, 3003e3014. Zealand, C.M., Burn, D.H., Simonovic, S.P., 1999. Short term streamflow forecasting using artificial neural networks. J. Hydrol. 214, 32e48. https://doi.org/10.1016/S0022-1694(98)00242-X. Zhou, T., Wang, F., Yang, Z., Zhou, T., Wang, F., Yang, Z., 2017. Comparative analysis of ANN and SVM models combined with wavelet preprocess for groundwater depth prediction. Water 9, 781. https://doi.org/10.3390/w9100781. Zhu, S., Zhou, J., Ye, L., Meng, C., 2016. Streamflow estimation by support vector machine coupled with different methods of time series decomposition in the upper reaches of Yangtze River, China. Environ. Earth Sci. 75, 1e12. https://doi.org/10.1007/s12665-016-5337-7.

Chapter 10

Short-term flood forecasting using artificial neural networks, extreme learning machines, and M5 model tree

Mukesh K. Tiwari1, Ravinesh C. Deo2, Jan F. Adamowski3

1 Department of Irrigation and Drainage Engineering, College of Agricultural Engineering and Technology, Anand Agricultural University, Godhra, Gujarat, India; 2 School of Agricultural Computational and Environmental Sciences, International Centre of Applied Climate Sciences (ICACS), University of Southern Queensland, Springfield, QLD, Australia; 3 Department of Bioresource Engineering, Faculty of Agricultural and Environmental Science, McGill University, Montreal, QC, Canada

10.1 Introduction

Hourly flood forecasting is very important for issuing flood warnings, implementing flood prevention measures, and formulating flood evacuation plans and rehabilitation actions. Many rainfall-runoff and flood forecasting models have been developed and applied over a range of studies. Empirical, conceptual, physical, and data-driven approaches have been utilized to map the nonlinear relationship between rainfall and runoff (Adamala et al., 2014a,b, 2015; Deo et al., 2017; Mouatadid and Adamowski, 2017; Paudel et al., 2011; Sehgal et al., 2014; Tiwari et al., 2013; Tiwari and Adamowski, 2013; Verma et al., 2010). Over the last few years, the complex nature of water resources variables has brought increased attention to the potential of soft computing techniques such as artificial neural network (ANN) models. ANN models mimic how the human brain functions and have been widely applied as an effective method for modeling highly nonlinear phenomena in hydrological processes (Coulibaly et al., 2000; Hsu et al., 1995; Kişi, 2004). ANNs have been widely used in hydrological forecasting (e.g., Boucher et al., 2020; Nourani et al., 2014).




Recently, the extreme learning machine (ELM) model has gained wide popularity in hydrological forecasting as an artificial intelligence approach that improves on the classical ANN model (Abdullah et al., 2015; Atiquzzaman and Kandasamy, 2016; Deo and Sahin, 2016; Prasad et al., 2019; Wu et al., 2019). ELM modeling, where input weights and hidden layer biases are randomly assigned, requires significantly less computational time than ANN modeling for training datasets (Huang et al., 2006). Consequently, the forecasting problem is reduced to a linear system of equations whose output weights are analytically determined with the least squares method (Huang et al., 2006, 2012). The ELM model satisfies the universal approximation condition with good generalization performance (Huang, 2003; Huang et al., 2015) and, as such, is a suitable machine learning approach for hydrological forecasting. Taormina et al. (2015) explored the applicability of the ELM model in baseflow separation with modular models (MMs) and global models (GMs) using data from nine different gauging stations in the northern United States; ELM was applied instead of ANN for the development of both the GM and MM models. Li and Cheng (2014) applied a wavelet-based ELM (W-ELM) model for monthly discharge forecasting in two reservoirs in southwestern China. Results revealed that the ELM-based single-layer feedforward neural network (SLFN) performed better than support vector regression (SVR) for the estimation of peak discharges, whereas, for the overall discharge dataset, the WNN-SLFN hybrid model provided a more accurate forecast than the single standalone SLFN and SVR models. Yin et al. (2014) developed an ELM-based modular mechanism for real-time tidal forecasting in Canada. Nurhayati et al. (2015) modeled groundwater flow in reclaimed lowlands of South Sumatra, Indonesia, using ELM. Abdullah et al. (2015) showed that the ELM model is efficient, simple in application, fast, and possesses good generalization performance for simulating Penman-Monteith evapotranspiration, which was applied at the Mosul, Baghdad, and Basrah meteorological stations located in northern, central, and southern Iraq. The M5 model tree is another AI technique that has been found useful in solving many hydrological problems. Bhattacharya and Solomatine (2005) successfully applied the M5 model tree to develop a water level-discharge relationship at a discharge measuring station located in Swarupgunj on the River Bhagirathi, India. Stravs and Brilly (2007) used the M5 model tree to predict streamflows for seven Slovenian tributaries of the Sava River; the results indicated that the M5 model tree was useful for streamflow prediction during rainless periods. Taghi Sattari et al. (2013) indicated that the M5 model tree performed well for daily streamflow prediction in Sohu Stream, Turkey. Adnan et al. (2018) used the least square support vector machine, fuzzy genetic algorithm, and M5 model tree with climatic data from nearby stations to forecast streamflows in a poorly gauged mountainous watershed in the Hunza River basin, in the high mountainous area of central Karakoram, Pakistan.



Given the usefulness and wide use of ANN, ELM, and M5 model tree models in hydrological forecasting, this chapter provides brief theoretical details of ANN, ELM, and M5 model tree methods and compares their applications in short-term flood forecasting for the Mary River catchment, Australia, as a case study.

10.2 Theoretical background

10.2.1 Artificial neural networks

ANN models represent a class of nonlinear regression techniques that mimic the biological function of the nervous system (Abiodun et al., 2018; Haykin, 2010). ANN models are known for their highly interconnected framework that transmits information from the input layer through weighted connections and functional nodes called transfer functions. These transfer functions facilitate the nonlinear mapping of the data to high-dimensional hyperplanes, which allow for the separation of the data patterns and derivation of a model output. The ANN model is defined as a form of nonlinear regression model that performs an input-output mapping using a set of feedforward neural networks. It consists of an input layer, one or more hidden layers of computation nodes, and an output layer of computation nodes. The ANN has been found to be fast and efficient in noisy environments and has been employed to solve a wide range of problems (Abiodun et al., 2018; Adamala et al., 2014b; Deo et al., 2017; Koradia et al., 2019; Kumar et al., 2015; Mouatadid and Adamowski, 2017; Tiwari and Adamowski, 2017). With these advantages, ANN models have become a widely used AI technique in numerous real-world applications such as time series forecasting. An output node of an ANN model is given by the following expression (Haykin, 2010; Makwana and Tiwari, 2014):

y_n = \sum_{j=1}^{J} w_{jk} \, \varphi\!\left( \sum_{i=1}^{h} w_{ij} x_i \right)    (10.1)

where w_{ij} and w_{jk} are the connection weights, whose values are optimized during model training; \varphi is a generally sigmoidal activation function; h and J are the respective numbers of input and hidden nodes; and x_i is the model input variable. More details about the properties of ANN models and their applications in water resources engineering are available in the literature (Abrahart et al., 2012; Cook, 2020; Haykin, 2010; Maier and Dandy, 2000; Ripley, 2007).
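To make Eq. (10.1) concrete, a minimal sketch of a single output node of a one-hidden-layer ANN is given below; the weights are random placeholders rather than trained values, and the logistic sigmoid stands in for the generic sigmoidal activation \varphi.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
h, J = 4, 6                      # number of input nodes and hidden nodes (illustrative)
x = rng.random(h)                # model inputs x_i
w_ih = rng.normal(size=(J, h))   # input-to-hidden weights w_ij
w_ho = rng.normal(size=J)        # hidden-to-output weights w_jk

# Eq. (10.1): y_n = sum_j w_jk * phi( sum_i w_ij * x_i )
hidden = sigmoid(w_ih @ x)
y_n = float(w_ho @ hidden)
print(y_n)
```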

10.2.2 Extreme learning machines

The ELM model is an SLFN algorithm that operates in a similar manner to the feedforward backpropagation artificial neural network (FFBP-ANN) and least square support vector regression (LSSVR) models (Deo et al., 2017).


However, the performance of the ELM model is generally better than that of the FFBP-ANN and LSSVR models in solving regression problems, with shorter modeling times (Huang et al., 2015). In the ELM model algorithm, the input weights (and biases) are randomized, and hence the output weights have a unique least squares solution obtained with the Moore-Penrose generalized inverse function (Huang et al., 2006). This often results in improved forecasting performance of the ELM model in comparison to the FFBP-ANN and LSSVR models (Acharya et al., 2014; Deo and Sahin, 2015a,b; Huang et al., 2012). The ELM model is a simple three-step procedure requiring no parameterization except for the randomization of hidden neurons (weights and biases), which may be generated from a continuous probability distribution (e.g., the uniform, normal, or triangular distribution). The ELM model has advantages over conventional neural network models (e.g., FFBP-ANN), which suffer from slow convergence speeds, issues with local minima (e.g., nonglobal solutions for the model parameters), overfitting problems, and iterative tuning of model parameters (Deo et al., 2017). An ELM model is developed following four steps: (i) determining the network architecture (i.e., the inputs and the number of hidden layer neurons H), (ii) selecting a suitable activation function that may be piecewise continuous but must be infinitely differentiable, (iii) assigning the random input weights and biases from a continuous probability distribution, and (iv) solving a linear system of equations. Given an input-output dataset, X (model inputs/explanatory variables) and Y (target output/response variable), comprised of N training data samples, i.e., (x_t, y_t) for t = 1, 2, ..., N, where x_t = [x_{t1}, x_{t2}, ..., x_{tn}]^T \in R^n and y_t = [y_{t1}, y_{t2}, ..., y_{tm}]^T \in R^m, an SLFN model with H hidden nodes and activation function g(x) is mathematically presented as (Huang et al., 2006):

\sum_{t=1}^{H} \beta_t \, g(\mathbf{w}_t \cdot \mathbf{x}_j + b_t) = \mathbf{o}_j    (10.2)

where j = 1, ..., N; w_t = [w_{t1}, w_{t2}, ..., w_{tn}]^T is the connection weight matrix between the hidden and input layer; \beta_t = [\beta_{t1}, \beta_{t2}, ..., \beta_{tm}]^T is the weight matrix connecting the hidden nodes with the output nodes and the model output, O (o_j \in R^m); b_t is the threshold of the t-th hidden node; w_t \cdot x_j is the inner product between w_t and x_j; and g(w_t, x_j, b_t) is the hidden layer activation function (set as the log-sigmoid for the case study presented in this chapter). The log-sigmoid is often selected as the activation function over other similar functions (e.g., bipolar logistic, hyperbolic tangent function, Elliott sigmoid, or radial basis function) as this form of activation function has commonly been adopted in studies dealing with hydrological forecasting (Adamowski et al., 2012; Deo and Sahin, 2015a,b; Quilty et al., 2016).



Likewise, the linear transfer function in the output layer is often selected because nonlinear functions often fail to improve performance (Akusok et al., 2015). However, in a future study, the detailed effects of alternative activation functions based on sine, tangent sigmoid, radial basis, and triangular basis, among others (Acharya et al., 2014; Deo and Sahin, 2015a,b; Tiwari and Chatterjee, 2010), could be investigated, although these functions are less common in hydrology. Huang et al. (2006) showed that the SLFN model can approximate the N training samples with zero error, i.e., \sum_{t=1}^{N} \lVert o_t - y_t \rVert = 0, with network parameters that satisfy Eq. (10.2). This leads to the realization that the network parameters can be determined analytically for a given training sample (w_t and b_t are sampled from a continuous probability distribution). Following this, \beta is estimated directly from the N input-output samples as a linear system of equations of the following (matrix notational) form (Huang et al., 2006, 2015):

G \beta = Y    (10.3)

where

G = \begin{bmatrix} g(\mathbf{w}_1 \cdot \mathbf{x}_1 + b_1) & \cdots & g(\mathbf{w}_H \cdot \mathbf{x}_1 + b_H) \\ \vdots & \ddots & \vdots \\ g(\mathbf{w}_1 \cdot \mathbf{x}_N + b_1) & \cdots & g(\mathbf{w}_H \cdot \mathbf{x}_N + b_H) \end{bmatrix}_{N \times H}    (10.4)

\beta = \begin{bmatrix} \beta_1^T \\ \vdots \\ \beta_H^T \end{bmatrix}_{H \times m} \quad \text{and} \quad Y = \begin{bmatrix} y_1^T \\ \vdots \\ y_N^T \end{bmatrix}_{N \times m}    (10.5)

where G is known as the hidden layer output matrix and T represents the transpose of the matrix/vector. Thus, the hidden layer matrix is inverted to estimate the output weights using the Moore-Penrose generalized inverse function (\dagger) (Huang et al., 2006):

\hat{\beta} = G^{\dagger} Y    (10.6)

where \hat{\beta} represents the estimated output weights from the N data samples. For a new input vector (x_{new}), i.e., outside of the training dataset, an ELM forecast (\hat{y}) is obtained by (Akusok et al., 2015):

\hat{y} = \sum_{i=1}^{H} \hat{\beta}_i \, g_i(\mathbf{w}_i \cdot \mathbf{x}_{new} + b_i)    (10.7)
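A minimal numerical sketch of Eqs. (10.2)-(10.7) follows: random hidden weights and biases are drawn, the hidden layer output matrix G is formed with a log-sigmoid activation, and the output weights are obtained with the Moore-Penrose pseudoinverse (numpy's pinv). The toy data, the choice of H = 20 hidden nodes, and the use of Python rather than the chapter's Matlab implementation are illustrative assumptions only.

```python
import numpy as np

def logsig(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(42)
N, n, H = 200, 3, 20                      # training samples, inputs, hidden nodes
X = rng.random((N, n))                    # inputs x_t
Y = np.sin(X.sum(axis=1, keepdims=True))  # target outputs y_t (toy example)

# Step (iii): random input weights w_t and biases b_t from a continuous distribution
W = rng.normal(size=(H, n))
b = rng.normal(size=H)

# Eq. (10.4): hidden layer output matrix G (N x H)
G = logsig(X @ W.T + b)

# Eq. (10.6): output weights via the Moore-Penrose generalized inverse
beta = np.linalg.pinv(G) @ Y

# Eq. (10.7): forecast for a new input vector outside the training set
x_new = np.array([[0.2, 0.5, 0.7]])
y_hat = logsig(x_new @ W.T + b) @ beta
print(y_hat[0, 0])
```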


10.2.3 M5 model tree

A model tree is used for numerical prediction of a variable and stores a linear regression model at each leaf that predicts the class value of instances reaching that leaf (Prasad et al., 2017). Application of the M5 model tree involves determining, by means of a splitting criterion, the best attribute with which to split the portion T of the training data that reaches a particular node. The standard deviation of the portion of the dataset in T is treated as a measure of the error at that node. Each attribute at that node is tested by calculating the expected reduction in error, and the attribute chosen for splitting is the one that maximizes the expected error reduction. The standard deviation reduction (SDR) is this expected error reduction, calculated from (Bhattacharya and Solomatine, 2005):

SDR = sd(T) - \sum_{i} \frac{|T_i|}{|T|} \, sd(T_i)    (10.8)

where T_i corresponds to the datasets T_1, T_2, T_3, ... that result from splitting the node according to the chosen attribute. The linear regression models at the leaves predict continuous numeric attributes; they act much like piecewise linear functions, and when they are finally combined, a nonlinear function is formed (Bhattacharya and Solomatine, 2005). The performance of the model is measured by different performance indicators that analyze how closely the observed values are simulated for the testing or unseen dataset (Onyari and Ilunga, 2013). For detailed descriptions of the M5 model tree, readers may refer to Bhattacharya and Solomatine (2005) and Quinlan (1992).
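The sketch below illustrates the SDR criterion of Eq. (10.8) for a single candidate split on one attribute; the data and threshold search are illustrative and do not reproduce the full M5 induction, smoothing, and pruning procedure.

```python
import numpy as np

def sdr(y, left_mask):
    """Standard deviation reduction: sd(T) - sum_i |T_i|/|T| * sd(T_i)."""
    y = np.asarray(y, float)
    subsets = [y[left_mask], y[~left_mask]]
    return y.std() - sum(len(s) / len(y) * s.std() for s in subsets if len(s) > 0)

# Toy data: pick the threshold on attribute x that maximizes SDR at this node.
rng = np.random.default_rng(3)
x = rng.random(100)
y = np.where(x < 0.4, 10.0, 25.0) + rng.normal(0, 1.0, 100)

thresholds = np.unique(x)
best = max(thresholds, key=lambda t: sdr(y, x <= t))
print("best split threshold:", round(best, 3), "SDR:", round(sdr(y, x <= best), 3))
```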

10.3 Application of ANN, ELM, and M5 model tree techniques in hourly flood forecasting: a case study

Application of the ANN, ELM, and M5 model tree in hourly flood forecasting is explored for a gauge station in the Mary River catchment of Australia. The stream gauging station at Miva (138001A) is situated in the Mary basin in the Central Queensland region and has a catchment area of 4755 km2 (Waters, 2014), with a subtropical climate and a coastal catchment type (Brodie and Khan, 2016).

10.3.1 Study area and data

Hourly flow data for this gauge station were downloaded from the Queensland water monitoring information portal (https://water-monitoring.information.qld.gov.au/).



Six years of hourly flow data from the gauge station (138001A) were used for model development and evaluation purposes. The first 4 years of hourly time series data (2009-12), comprising 35,054 hourly data points, were used for model development, and the remaining 2 years of hourly time series data (2013 and 2014), comprising 8,750 and 8,727 hourly data points, respectively, were applied for cross-validation (2013) and evaluation (2014) of the developed models.

10.3.2 Methodology

All three models were developed using the hourly dataset (4 years for training and 1 year of data each for cross-validation and testing). The training dataset was used to develop the different models, and the cross-validation dataset was used to check for overfitting during training. The testing dataset was used to assess the performance of the developed models on unseen data. Forecasts were made for 1-, 5-, and 10-h lead times. The codes were developed in Matlab (The Mathworks Inc, 2018) for all three models. Based on the autocorrelation statistical analysis, only 1-h lag data were applied to develop forecasts at the different lead times. Performances of the developed models were evaluated using the following indicators: coefficient of determination (R2), Nash-Sutcliffe efficiency (NSE), root mean square error (RMSE), percent peak deviation (Pdv), and mean absolute error (MAE) (Sehgal et al., 2014; Tiwari and Adamowski, 2015). A basic flowchart showing the methodology is presented in Fig. 10.1.
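A minimal sketch of how the lagged inputs and lead-time targets could be assembled from the hourly series, with a chronological split into training, cross-validation, and testing blocks, is shown below; the chapter's models were implemented in Matlab, so this Python fragment and the synthetic flow series are illustrative assumptions only.

```python
import numpy as np

def make_pairs(flow, lead):
    """Pair the 1-h lagged flow (input) with the flow `lead` hours ahead (target)."""
    X = flow[:-lead].reshape(-1, 1)   # input: flow at time t
    y = flow[lead:]                   # target: flow at time t + lead
    return X, y

# Hypothetical hourly flow series (ML/day); in the case study this covers 2009-14.
flow = np.abs(np.random.default_rng(7).normal(1000, 300, 6 * 8760))

for lead in (1, 5, 10):               # 1-, 5-, and 10-h lead times
    X, y = make_pairs(flow, lead)
    n_train = 4 * 8760                # first four years for training
    n_cv = 8760                       # next year for cross-validation
    X_tr, y_tr = X[:n_train], y[:n_train]
    X_cv, y_cv = X[n_train:n_train + n_cv], y[n_train:n_train + n_cv]
    X_te, y_te = X[n_train + n_cv:], y[n_train + n_cv:]
    print(lead, X_tr.shape, X_cv.shape, X_te.shape)
```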

10.4 Results and discussion

All three models developed in this study used the 6-year dataset (2009-14), which was divided into training, cross-validation, and testing stages. The first 4 years of data were used for training the models, 1 year for cross-validation, and the remaining 1 year for testing the performance of the developed models. All three models were trained using the training dataset, with cross-validation against the cross-validation dataset used to check for and avoid overfitting, with the aim of developing models with better generalization capabilities. The details of the dataset used in this study, along with some basic statistics, are presented in Table 10.1. Values of the five performance evaluation criteria (i.e., R2, NSE, RMSE, Pdv, and MAE) for the ANN, ELM, and M5 model tree models for 1-h lead time forecasts are presented in Table 10.2.


FIGURE 10.1 A flowchart showing the step-by-step procedure for applying the three data-driven models, i.e., artificial neural network, extreme learning machines, and M5 model tree in flood forecasting.

Model performances are also shown graphically by plotting hydrographs of the observed and model-predicted river flows for 1-h lead time flood forecasts (Fig. 10.2). In terms of R2, NSE, RMSE, and Pdv, the ELM and M5 model tree had identical values of 0.999, 99.9%, 221.2 ML/day, and 0%, respectively, and performed better than the ANN model, which had values of 0.994, 79.5%, 2666.7 ML/day, and 1.9%, respectively.



TABLE 10.1 Basic statistics of river flow data from the Mary River catchment, Australia (2009-14).

Data             | Duration (year) | No. of data points (hourly) | Minimum (ML/day)a | Maximum (ML/day) | Standard deviation (ML/day)
Training         | 4               | 35,064                      | 0.15              | 565,546.71       | 28,529.71
Cross-validation | 1               | 8,760                       | 19.71             | 653,695.87       | 56,806.63
Testing          | 1               | 8,737                       | 0.00              | 78,765.18        | 5,882.81

a Megaliters per day.

TABLE 10.2 Values of performance evaluation indicators of the developed artificial neural network (ANN), extreme learning machine (ELM), and M5 model tree for flood forecasting at 1-h lead time.

Model         | R2    | NSE (%) | RMSE (ML/day) | Pdv (%) | MAE (%)
ANN           | 0.994 | 79.5    | 2666.7        | 1.9     | 2628.6
ELM           | 0.999 | 99.9    | 221.2         | 0.0     | 29.2
M5 model tree | 0.999 | 99.9    | 221.2         | 0.0     | 29.1

MAE, mean absolute error; NSE, Nash-Sutcliffe efficiency; Pdv, percent peak deviation; R2, coefficient of determination; RMSE, root mean square error.

On the other hand, in terms of MAE, the performance of the M5 model tree (MAE = 29.1%) was better than that of the ELM (MAE = 29.2%), followed by the ANN (MAE = 2628.6%) for 1-h lead flood flow forecasting. Overall, in terms of the different performance indicators, the ELM and M5 model tree performed better than the ANN model for 1-h lead time flood flow forecasts. The performance of the ANN model was inferior because it did not simulate the lower flood values satisfactorily compared with the other two models. This is clearly depicted in the scatter plots shown for all three modeling techniques in Fig. 10.2. These scatter plots show that the ELM and M5 models perform very well in flood forecasting, and that their predicted values lie close to the 1:1 line with respect to the observed values, compared with the values predicted by the ANN model.


FIGURE 10.2 Time plots (left side) and scatter plots (right side) of observed and model-predicted forecasts made by (A) artificial neural network, (B) extreme learning machine, and (C) M5 model tree models for flood forecasting at a 1-h lead time.

In addition to developing models for 1-h lead time flood forecasts, the modeling capabilities of the ANN, ELM, and M5 model tree were also assessed for higher (5- and 10-h) lead time flood forecasts. The results for 5-h lead time forecasts are presented in Table 10.3 and Fig. 10.3.



TABLE 10.3 Values of performance evaluation indicators of the developed artificial neural network (ANN), extreme learning machine (ELM), and M5 model tree for flood forecasting at a 5-h lead time.

Model         | R2    | NSE (%) | RMSE (ML/day) | Pdv (%) | MAE (%)
ANN           | 0.961 | 90.4    | 1823.8        | 3.7     | 1201.9
ELM           | 0.984 | 96.7    | 1062.7        | 0.9     | 191.7
M5 model tree | 0.983 | 96.6    | 1092.6        | 0.3     | 159.8

MAE, mean absolute error; NSE, Nash-Sutcliffe efficiency; Pdv, percent peak deviation; R2, coefficient of determination; RMSE, root mean square error.

It is observed, in terms of the five performance indices, that as the lead time increased, the performance of the models deteriorated. In terms of R2, NSE, RMSE, Pdv, and MAE, two models (ELM and M5 model tree) had quite similar performances, with values of 0.984, 96.7%, 1062.7 ML/day, 0.9%, and 191.7% for the ELM model and 0.983, 96.6%, 1092.6 ML/day, 0.3%, and 159.8% for the M5 model tree, respectively, and performed better than the ANN model, which had values of 0.961, 90.4%, 1823.8 ML/day, 3.7%, and 1201.9%, respectively, for 5-h lead flood flow forecasting. Overall, the performance of the ELM and M5 model tree for flood forecasting at a 5-h lead time was found to be better than that of the ANN model. All three models were also developed for flood flow forecasting at a 10-h lead time. The performance of the three models for 10-h lead time flood forecasts in terms of the performance evaluation indicators is presented in Table 10.4. It can be observed that, similar to model performance at a 5-h lead time, the performance of the models at a 10-h lead time deteriorated with the increased lead time. In terms of R2, NSE, RMSE, and Pdv, the three models (ANN, ELM, and M5 model tree) had very close values of 0.940, 87.9%, 2051.6 ML/day, and 4.2%; 0.941, 88.3%, 2009.0 ML/day, and 2.8%; and 0.936, 87.5%, 2083.7 ML/day, and 5.5%, respectively, whereas, in terms of MAE, the performance of the M5 model tree (MAE = 340.7%) was better than that of the ELM model (MAE = 457.7%), followed by the ANN model (MAE = 702.5%), for 10-h lead flood flow forecasting.


FIGURE 10.3 Time plots (left side) and scatter plots (right side) of observed and model-predicted forecasts made by (A) artificial neural network, (B) extreme learning machine, and (C) M5 model tree models for flood forecasting at a 5-h lead time.

presented in Fig. 10.4. It is clear that the performance of all three models is inferior to their performance at the 1-h and 5-h lead times: the points are more widely scattered about the 1:1 line for the 10-h lead time than for the shorter lead times. In terms of the statistical and graphical performance evaluation indicators, all three models gave similar overall performances at this lead time.
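As a rough illustration of how the 1-, 5-, and 10-h lead-time experiments can be set up, the sketch below builds lagged-input/led-target pairs from an hourly flow series with pandas. The column names, file name, and the choice of three antecedent flows as predictors are assumptions for illustration, not the chapter's exact configuration.

```python
import pandas as pd

def make_leadtime_dataset(flow: pd.Series, n_lags: int = 3, lead: int = 1) -> pd.DataFrame:
    """Build a supervised-learning table: Q(t-1..t-n_lags) as inputs, Q(t+lead) as target.

    `flow` is an hourly streamflow series indexed by time. The returned frame can be
    split chronologically into training and testing sets before fitting ANN/ELM/M5-type models.
    """
    data = {f"Q_t-{k}": flow.shift(k) for k in range(1, n_lags + 1)}
    data[f"Q_t+{lead}"] = flow.shift(-lead)          # lead-time-shifted target
    return pd.DataFrame(data).dropna()               # drop rows made incomplete by shifting

# Hypothetical usage for the three lead times compared in the chapter:
# hourly_q = pd.read_csv("hourly_flow.csv", index_col=0, parse_dates=True)["flow"]
# for lead in (1, 5, 10):
#     table = make_leadtime_dataset(hourly_q, n_lags=3, lead=lead)
```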

TABLE 10.4 Values of performance evaluation indicators of the developed artificial neural network (ANN), extreme learning machine (ELM), and M5 model tree for flood forecasting at a 10-h lead time.

Model           R2      NSE (%)   RMSE (ML/day)   Pdv (%)   MAE (%)
ANN             0.940   87.9      2051.6          4.2       702.5
ELM             0.941   88.3      2009.0          2.8       457.7
M5 model tree   0.936   87.5      2083.7          5.5       340.7

MAE, mean absolute error; NSE, Nash–Sutcliffe efficiency; Pdv, percent peak deviation; R2, coefficient of determination; RMSE, root mean square error.

FIGURE 10.4 Time plots (left side) and scatter plots (right side) of observed and model-predicted forecasts made by (A) artificial neural network, (B) extreme learning machine, and (C) M5 model tree models for flood forecasting at a 10-h lead time.


10.5 Conclusions

This chapter dealt with recent advances in the application of soft computing techniques such as ANN, ELM, and M5 model tree for hourly flood forecasting at 1-, 5-, and 10-h lead times. The ANN technique has been widely applied for flood forecasting, while the more recently developed ELM and M5 model tree have gained wide popularity in hydrological forecasting as improved artificial intelligence paradigms requiring less computational time than the classical ANN model. Theoretical details of the ANN, ELM, and M5 model tree were described, and a case study from the Mary River catchment in Australia was provided in which the three soft computing techniques were compared for flood forecasting. Results of the case study indicated that the performances of the ELM and M5 model tree were better than that of the ANN model at 1- and 5-h lead times. The flood forecasts at a 10-h lead time revealed very similar performances of the three models. There is a need to further explore more advanced soft computing techniques to help improve time series forecasting in general, and flood forecasting in particular, and thereby support better awareness, prevention, and evacuation planning and management strategies.

References Abdullah, S.S., Malek, M.A., Abdullah, N.S., Kisi, O., Yap, K.S., 2015. Extreme Learning Machines: a new approach for prediction of reference evapotranspiration. J. Hydrol. 527, 184e195. https://doi.org/10.1016/j.jhydrol.2015.04.073. Abiodun, O.I., Jantan, A., Omolara, A.E., Dada, K.V., Mohamed, N.A.E., Arshad, H., 2018. Stateof-the-art in artificial neural network applications: a survey. Heliyon 4, 1e41. https://doi.org/ 10.1016/j.heliyon.2018.e00938 Abrahart, R.J., Anctil, F., Coulibaly, P., Dawson, C.W., Mount, N.J., See, L.M., Shamseldin, A.Y., Solomatine, D.P., Toth, E., Wilby, R.L., 2012. Two decades of anarchy? emerging themes and outstanding challenges for neural network river forecasting. Prog. Phys. Geogr. 36, 480e513. https://doi.org/10.1177/0309133312444943. Acharya, N., Shrivastava, N.A., Panigrahi, B.K., Mohanty, U.C., 2014. Development of an artificial neural network based multi-model ensemble to estimate the northeast monsoon rainfall over south peninsular India: an application of extreme learning machine. Clim. Dynam. 43, 1303e1310. https://doi.org/10.1007/s00382-013-1942-2. Adamala, S., Raghuwanshi, N.S., Mishra, A., Tiwari, M.K., 2014a. Evapotranspiration modeling using second-order neural networks. J. Hydrol. Eng. ASCE 19, 1131e1140. https://doi.org/10. 1061/(ASCE)HE.1943-5584.0000887. Adamala, S., Raghuwanshi, N.S., Mishra, A., Tiwari, M.K., 2014b. Development of generalized higher-order synaptic neuralebased ETo models for different agroecological regions in India. J. Irrigat. Drain. Eng. 140, 04014038. https://doi.org/10.1061/(ASCE)IR.1943-4774.0000784. Adamala, S., Raghuwanshi, N.S., Mishra, A., Tiwari, M.K., 2015. Closure to “evapotranspiration modeling using second-order neural networks” by Sirisha Adamala, N. S. Raghuwanshi, Ashok Mishra, and Mukesh K. Tiwari. J. Hydrol. Eng. 20, 07015015. https://doi.org/10.1061/ (ASCE)HE.1943-5584.0001207.


Adamowski, J., Fung Chan, H., Prasher, S.O., Ozga-Zielinski, B., Sliusarieva, A., 2012. Comparison of multiple linear and nonlinear regression, autoregressive integrated moving average, artificial neural network, and wavelet artificial neural network methods for urban water demand forecasting in Montreal, Canada. Water Resour. Res. 48, W01528. https://doi.org/10. 1061/(ASCE)HE.1943-5584.0001207. Adnan, R.M., Yuan, X., Kisi, O., Adnan, M., Mehmood, A., 2018. Stream flow forecasting of poorly gauged mountainous watershed by least square support vector machine, fuzzy genetic algorithm and M5 model tree using climatic data from nearby station. Water Resour. Manag. 32, 4469e4486. https://doi.org/10.1007/s11269-018-2033-2. Akusok, A., Bjork, K.M., Miche, Y., Lendasse, A., 2015. High-performance extreme learning machines: a complete toolbox for big data applications. IEEE Access 3, 1011e1025. https:// doi.org/10.1109/ACCESS.2015.2450498. Atiquzzaman, M., Kandasamy, J., 2016. Prediction of hydrological time-series using extreme learning machine. J. Hydroinf. 18, 345e353. https://doi.org/10.2166/hydro.2015.020. Bhattacharya, B., Solomatine, D.P., 2005. Neural networks and M5 model trees in modelling water level-discharge relationship. Neurocomputing 63, 381e396. https://doi.org/10.1016/j.neucom. 2004.04.016. Boucher, M.A., Quilty, J., Adamowski, J., 2020. Data assimilation for streamflow forecasting using extreme learning machines and multilayer perceptrons. Water Resour. Res. 56 e2019WR026226. https://doi.org/10.1029/2019WR026226. Brodie, I.M., Khan, S., 2016. A direct analysis of flood interval probability using approximately 100-year streamflow datasets. Hydrol. Sci. J. 61, 2213e2225. https://doi.org/10.1080/ 02626667.2015.1099790. Cook, T.R., 2020. Neural networks. In: Advanced Studies in Theoretical and Applied Econometrics, pp. 161e189. https://doi.org/10.1007/978-3-030-31150-6_6. Coulibaly, P., Anctil, F., Bobe´e, B., 2000. Daily reservoir inflow forecasting using artificial neural networks with stopped training approach. J. Hydrol. 230, 244e257. https://doi.org/10.1016/ S0022-1694(00)00214-6. Deo, R.C., Sahin, M., 2015. Application of the Artificial Neural Network model for prediction of monthly Standardized Precipitation and Evapotranspiration Index using hydrometeorological parameters and climate indices in eastern Australia. Atmos. Res. 161e162, 65e81. https://doi. org/10.1016/j.atmosres.2015.03.018. Deo, R.C., Sahin, M., 2015. Application of the extreme learning machine algorithm for the prediction of monthly Effective Drought Index in eastern Australia. Atmos. Res. 153, 512e525. https://doi.org/10.1016/j.atmosres.2014.10.016. Deo, R.C., Sahin, M., 2016. An extreme learning machine model for the simulation of monthly mean streamflow water level in eastern Queensland. Environ. Monit. Assess. 188, 1e24. https://doi.org/10.1007/s10661-016-5094-9. Deo, R.C., Tiwari, M.K., Adamowski, J.F., Quilty, J.M., 2017. Forecasting effective drought index using a wavelet extreme learning machine (W-ELM) model. Stoch. Environ. Res. Risk Assess. 31, 1211e1240. https://doi.org/10.1007/s00477-016-1265-z. Haykin, S., 2010. Neural Networks and Learning Machines, third edition. Pearson Prentice Hall, New Jersey USA, p. 944. Hsu, K.-l, Gupta, H.V., Sorooshian, S., 1995. Artificial neural network modeling of the rainfallrunoff process. Water Resour. Res. 31, 2517e2530. https://doi.org/10.1029/95WR01955. Huang, G.B., 2003. Learning capability and storage capacity of two-hidden-layer feedforward networks. 
IEEE Trans. Neural Network. 14, 274e281. https://doi.org/10.1109/TNN.2003. 809401.

278 Advances in Streamflow Forecasting Huang, G.B., Zhou, H., Ding, X., Zhang, R., 2012. Extreme learning machine for regression and multiclass classification. IEEE Trans. Syst. Man Cybern. B Cybern. 42, 513e529. https://doi. org/10.1109/TSMCB.2011.2168604. Huang, G.B., Zhu, Q.Y., Siew, C.K., 2006. Extreme learning machine: theory and applications. Neurocomputing 70, 489e501. https://doi.org/10.1016/j.neucom.2005.12.126. Huang, G., Huang, G.B., Song, S., You, K., 2015. Trends in Extreme Learning Machines: A Review. Neural Networks 61, 32e48. https://doi.org/10.1016/j.neunet.2014.10.001. ¨ ., 2004. Multi-layer perceptrons with Levenberg-Marquardt training algorithm for susKis¸i, O pended sediment concentration prediction and estimation. Hydrol. Sci. J. 49, 1025e1040. https://doi.org/10.1623/hysj.49.6.1025.55720. Koradia, A.K., Bhalala, A.D., Tiwari, M.K., 2019. Rainfall-runoff Simulation Modelling Using Artificial Neural Networks in Semi-arid Middle Gujarat Region, 47(3), 231e238. Kumar, S., Tiwari, M.K., Chatterjee, C., Mishra, A., 2015. Reservoir inflow forecasting using ensemble models based on neural networks, wavelet analysis and bootstrap method. Water Resour. Manag. 29, 4863e4883. https://doi.org/10.1007/s11269-015-1095-7. Li, B.J., Cheng, C.T., 2014. Monthly discharge forecasting using wavelet neural networks with extreme learning machine. Sci. China Technol. Sci. 57, 2441e2452. https://doi.org/10.1007/ s11431-014-5712-0. Maier, H.R., Dandy, G.C., 2000. Neural networks for the prediction and forecasting of water resources variables: a review of modelling issues and applications. Environ. Model. Software 15, 101e124. https://doi.org/10.1016/S1364-8152(99)00007-9. Makwana, J.J., Tiwari, M.K., 2014. Intermittent streamflow forecasting and extreme event modelling using wavelet based artificial neural networks. Water Resour. Manag. 28, 4857e4873. https://doi.org/10.1007/s11269-014-0781-1. Mouatadid, S., Adamowski, J., 2017. Using extreme learning machines for short-term urban water demand forecasting. Urban Water J. 14, 630e638. https://doi.org/10.1080/1573062X. 2016.1236133. Nourani, V., Hosseini Baghanam, A., Adamowski, J., Kisi, O., 2014. Applications of hybrid wavelet-Artificial Intelligence models in hydrology: a review. J. Hydrol. 514, 358e377. https://doi.org/10.1016/j.jhydrol.2014.03.057. Nurhayati, Soekarno, I., Hadihardaja, I.K., Cahyono, M., 2015. A study of hold-out and k-fold cross validation for accuracy of groundwater modeling in tidal lowland reclamation using extreme learning machine. In: Proceedings of 2014 2nd International Conference on Technology, Informatics, Management, Engineering and Environment. TIME-E, pp. 228e233, 2014. https://doi.org/10.1109/TIME-E.2014.7011623. Onyari, E.K., Ilunga, F.M., 2013. Application of MLP neural network and M5P model tree in predicting streamflow: a case study of Luvuvhu catchment, South Africa. Int. J. Innov. Manag. Technol. 4, 11. https://doi.org/10.7763/IJIMT.2013.V4.347. Paudel, M., Nelson, E.J., Downer, C.W., Hotchkiss, R., 2011. Comparing the capability of distributed and lumped hydrologic models for analyzing the effects of land use change. J. Hydroinf. 13, 461e473. https://doi.org/10.2166/hydro.2010.100. Prasad, R., Deo, R.C., Li, Y., Maraseni, T., 2017. Input selection and performance optimization of ANN-based streamflow forecasts in the drought-prone Murray Darling Basin region using IIS and MODWT algorithm. Atmos. Res. 197, 42e63. https://doi.org/10.1016/j.atmosres. 2017.06.014. 
Prasad, R., Deo, R.C., Li, Y., Maraseni, T., 2019. Weekly soil moisture forecasting with multivariate sequential, ensemble empirical mode decomposition and Boruta-random forest hybridizer algorithm approach. Catena 177, 149e166. https://doi.org/10.1016/j.catena.2019.02.012.


Quilty, J., Adamowski, J., Khalil, B., Rathinasamy, M., 2016. Bootstrap rank-ordered conditional mutual information (broCMI): a nonlinear input variable selection method for water resources modeling. Water Resour. Res. 52, 2299e2326. https://doi.org/10.1002/2015WR016959. Quinlan, J.R., 1992. Learning with continuous classes. In: Australian Joint Conference on Artificial Intelligence, pp. 343e348. Ripley, B.D., 2007. Pattern Recognition and Neural Networks. Cambridge University Press, p. 403. Sehgal, V., Tiwari, M.K., Chatterjee, C., 2014. Wavelet bootstrap multiple linear regression based hybrid modeling for daily River Discharge forecasting. Water Resour. Manag. 28, 2793e2811.  Stravs, L., Brilly, M., 2007. Development of a low-flow forecasting model using the M5 machine learning method. Hydrol. Sci. J. 52, 466e477. https://doi.org/10.1623/hysj.52.3.466. Taghi Sattari, M., Pal, M., Apaydin, H., Ozturk, F., 2013. M5 model tree application in daily river flow forecasting in Sohu Stream, Turkey. Water Resour. 40, 233e242. https://doi.org/10.1134/ S0097807813030123. Taormina, R., Chau, K.W., Sivakumar, B., 2015. Neural network river forecasting through baseflow separation and binary-coded swarm optimization. J. Hydrol. 529, 1788e1797. https://doi. org/10.1016/j.jhydrol.2015.08.008. The Mathworks Inc, 2018. MATLAB 2018. Www.Mathworks.Com/Products/Matlab. Tiwari, M.K., Adamowski, J., 2013. Urban water demand forecasting and uncertainty assessment using ensemble wavelet-bootstrap-neural network models. Water Resour. Res. 49, 6486e6507. https://doi.org/10.1002/wrcr.20517. Tiwari, M.K., Adamowski, J.F., 2015. Medium-term urban water demand forecasting with limited data using an ensemble waveletebootstrap machine-learning approach. J. Water Resour. Plann. Manag. ASCE 141, 04014053. https://doi.org/10.1061/(ASCE)WR.1943-5452.0000454. Tiwari, M.K., Adamowski, J.F., 2017. An ensemble wavelet bootstrap machine learning approach to water demand forecasting: a case study in the city of Calgary. Canada. Urban Water J. 14, 185e201. https://doi.org/10.1080/1573062X.2015.1084011. Tiwari, M.K., Chatterjee, C., 2010. Uncertainty assessment and ensemble flood forecasting using bootstrap based artificial neural networks (BANNs). J. Hydrol. 382, 20e33. https://doi.org/10. 1016/j.jhydrol.2009.12.013. Tiwari, M.K., Song, K.Y., Chatterjee, C., Gupta, M.M., 2013. Improving reliability of river flow forecasting using neural networks, wavelets and self-organising maps. J. Hydroinf. 15, 486e502. https://doi.org/10.2166/hydro.2012.130. Verma, A.K., Jha, M.K., Mahana, R.K., 2010. Evaluation of HEC-HMS and WEPP for simulating watershed runoff using remote sensing and geographical information system. Paddy Water Environ. 8, 131e144. https://doi.org/10.1007/s10333-009-0192-8. Waters, D., 2014. Modelling reductions of pollutant loads due to improved management practices in the Great Barrier Reef catchments, Whole of GBR. Tech. Report 1. Wu, L., Zhou, H., Ma, X., Fan, J., Zhang, F., 2019. Daily reference evapotranspiration prediction based on hybridized extreme learning machine model with bio-inspired optimization algorithms: application in contrasting climates of China. J. Hydrol. 577, 123960. https://doi.org/10. 1016/j.jhydrol.2019.123960. Yin, J.C., Li, G.S., Hu, J.Q., 2014. A modular prediction mechanism based on sequential extreme learning machine with application to real time tidal prediction. In: Sun, F., Toh, K.A., Romay, M., Mao, K. (eds) Extreme learning machines 2013: Algorithms and applications. 
Adaptation, learning, and optimization, vol. 16. Springer, Cham. pp. 35e53. https://doi.org/10. 1007/978-3-319-04741-6_4.

Chapter 11

A new heuristic model for monthly streamflow forecasting: outlier-robust extreme learning machine

Salim Heddam¹, Özgür Kişi²
¹Faculty of Science, Agronomy Department, Hydraulics Division, Laboratory of Research in Biodiversity Interaction Ecosystem and Biotechnology, University 20 Août 1955, Skikda, Algeria; ²Department of Civil Engineering, School of Technology, Ilia State University, Tbilisi, Georgia

11.1 Introduction

Streamflow plays an important role in the planning and management of water resources (Gibbs et al., 2018; Mazrooei and Sankarasubramanian, 2019; Choubin et al., 2019). In the past few decades, the availability of streamflow time series at various timescales has contributed significantly to improving the quality of research results, especially in the areas of hydrology, hydraulics, and hydrometeorology (Yaseen et al., 2015). It is therefore of significant importance to understand the variation of streamflow over time and space (Mihailovic et al., 2019), and this has proved to be a promising field of research (Ciria et al., 2019). As a consequence, a significant amount of work has been done to apply several kinds of forecasting models aimed at rapidly providing an estimate of the streamflow at a particular site (Yaseen et al., 2019a). Indeed, it is reported in the literature that data-driven machine learning (DD-ML) approaches are able to provide high-quality information about streamflow if they are fully and accurately applied. Thus, forecasting streamflow using machine learning models is highly important and holds considerable promise for quantitatively and qualitatively estimating streamflow time series. The competitive advantage of DD-ML approaches over classical regression models in capturing the high nonlinearity of streamflow time series has ranked them among the most common and widely used mathematical models for predicting and forecasting streamflow. This might be the reason behind the fact that the number of published research articles



has significantly increased over the years. The increased number of studies also reflects the importance attributed to the knowledge of streamflow variations over time. A variety of DD-ML models, e.g., the enhanced extreme learning machine (EELM) (Yaseen et al., 2019a), the adaptive neuro-fuzzy inference system (ANFIS) optimized using heuristic optimization algorithms (Yaseen et al., 2019b), and the discrete wavelet transform combined with support vector regression (Fang et al., 2019), have been developed and successfully applied for predicting and forecasting streamflow at various timescales (Adnan et al., 2019; Mehdizadeh et al., 2019; Freire et al., 2019; Fouchal and Souag-Gamane, 2019; Tikhamarine et al., 2019). As the DD-ML methods often suffer from overfitting and can be time-consuming, a suite of robust techniques has been incorporated into the regular DD-ML models to improve their accuracy. For this purpose, recent studies based on time–frequency analysis of signals, such as the discrete wavelet transform (DWT) (Labat, 2005; Labat et al., 2005), the empirical mode decomposition (EMD) (Huang et al., 1998), and the Hilbert transform (HT), combined with DD-ML (Hu et al., 2020; Adarsh and Reddy, 2016), have contributed actively to the progress of research and have helped to significantly improve model accuracy and to reduce the errors caused specifically by outliers and extreme streamflow values, by decomposing the original signal into several subsignals and thereby capturing the nonlinearity of the streamflow signal more effectively (Yaseen et al., 2019a,b; Fathian et al., 2019; Luo et al., 2019; Meng et al., 2019; Fang et al., 2019; Zhou et al., 2019; Rezaie-Balf et al., 2019; Al-Sudani et al., 2019; Wang et al., 2019a; Nanda et al., 2019; Adnan et al., 2019; Mehdizadeh et al., 2019; Freire et al., 2019; Fouchal and Souag-Gamane, 2019; Zamoum and Souag-Gamane, 2019; Tikhamarine et al., 2019). Yaseen et al. (2019a) presented an enhanced extreme learning machine (EELM) model for forecasting river flow in the tropical region of Guillemard station on the Kelantan River, located in the north-east of the Malaysian Peninsula. The EELM was tuned using the complete orthogonal decomposition method instead of the standard Moore–Penrose generalized inverse method. Results obtained using the EELM were compared with those obtained using the original ELM and support vector regression (SVR) models. Taking into account the autocorrelation function (ACF) and the partial autocorrelation function (PACF), the study adopted three input variables, namely the river flow (Q) at times t − 1, t − 2, and t − 3, to forecast Q at time t. A better relationship between measured and forecasted Q was observed using the EELM compared with the river flows predicted using the ELM and SVR models. Numerical comparison of the results between the models revealed a high value of the correlation coefficient (R) of 0.894, Nash–Sutcliffe efficiency (NSE) of 0.799, and Willmott's index (d) of 0.938 for the EELM model compared with those obtained using the ELM (R = 0.869, NSE = 0.743, and d = 0.918)


and SVR (R = 0.818, NSE = 0.665, and d = 0.892) models. Yaseen et al. (2019b) employed the ANFIS hybridized with three optimization algorithms, namely particle swarm optimization (PSO), genetic algorithm (GA), and differential evolution (DE), for forecasting streamflow in the Pahang River located in the central region of Peninsular Malaysia. Based on the obtained results, it is reported that the PSO algorithm was more suitable than the others and significantly improved the performance of the models, and the ANFIS-PSO was the best forecasting model with an R2 value of 0.998. Fathian et al. (2019) proposed an innovative approach based on self-exciting threshold autoregressive (SETAR) models for forecasting daily streamflow of the Zarrineh Rood River in the southern part of Lake Urmia, in northwestern Iran. The results indicated the superiority of the SETAR model combined with generalized autoregressive conditional heteroscedasticity (SETAR-GARCH) over the SETAR model alone. Luo et al. (2019) employed a hybrid model integrating four approaches for forecasting monthly streamflow: (i) factor analysis, (ii) time series decomposition, (iii) data regression, and (iv) error suppression. The four approaches were hybridized to improve the accuracy of the SVR model and the generalized regression neural network (GRNN) model. It was found that the seasonal-trend decomposition procedure significantly improved the accuracy of the SVR model (R2 value of 0.90) relative to the accuracy (R2 value of 0.81) obtained using the GRNN model. A modified EMD was incorporated with the SVR model to improve the forecasting accuracy of monthly streamflow of the Wei River, China (Meng et al., 2019). The resulting hybrid model, called M-EMDSVR, was compared with the standard SVR and ANN models. The integrated M-EMDSVR model was found to be more accurate than the individual SVR, ANN, and EMD-SVM models, with an NSE of 0.995 compared with 0.526, 0.612, and 0.970 achieved using the ANN, SVM, and EMD-SVM models, respectively. Using data collected at four stations in the Wei River basin, China, Fang et al. (2019) compared the DWT combined with the SVR model (DWT-SVR) and the variational mode decomposition (VMD) combined with the SVR model (VMD-SVR) for forecasting streamflow. The results demonstrated that the VMD-SVR model was more accurate in forecasting streamflow than the other models. Adnan et al. (2020) compared the optimally pruned extreme learning machine (OPELM), least square support vector machine (LSSVM), multivariate adaptive regression splines (MARS), and M5 model tree (M5 Tree) in predicting monthly streamflow of the Swat River basin, Hindu Kush Mountains, Pakistan. In addition to the salient DD-ML models whose application in streamflow forecasting is briefly discussed here, several other models have been successfully used for streamflow forecasting as reported in the literature. A few of these models are the recurrent adaptive network-based


fuzzy inference system (R-ANFIS) embedded with GA and a least square estimator (GL) (Zhou et al., 2019); the ensemble empirical mode decomposition (EEMD) combined with the M5 model tree (M5Tree) and MARS models (Rezaie-Balf et al., 2019); least square support vector regression (LSSVR) and MARS integrated with differential evolution (MARS-DE) models (Al-Sudani et al., 2019); improved complete EEMD with adaptive noise (ICEEMDAN) and extreme learning machine (ELM) approaches (Wang et al., 2019a,b); the dynamic wavelet-based nonlinear autoregressive model with exogenous inputs (WNARXu) and the nonlinear static wavelet-based neural network (WNNu) (Nanda et al., 2019); the optimally pruned extreme learning machine (OP-ELM) (Adnan et al., 2019); gene expression programming with the fractionally autoregressive integrated moving average (GEP-FARIMA), MARS-FARIMA, and multiple linear regression (MLR-FARIMA), and GEP, MARS, and MLR with the self-exciting threshold autoregressive (SETAR) time series model (GEP-SETAR, MARS-SETAR, and MLR-SETAR) (Mehdizadeh et al., 2019); wavelet-artificial neural network (ANN) models (Freire et al., 2019; Fouchal and Souag-Gamane, 2019); nonparametric K-nearest neighbor regression (Poul et al., 2019); hybrid wavelet support vector regression based on the gray wolf optimizer (Tikhamarine et al., 2019); and the water balance-based rainfall-runoff model (GR2M) integrated with self-organizing maps (Zamoum and Souag-Gamane, 2019; Adnan et al., 2020; Abbasi et al., 2020). This chapter introduces, for the first time, three new heuristic models, i.e., (i) the outlier-robust extreme learning machine (ORELM), (ii) the regularized extreme learning machine (RELM), and (iii) the weighted regularized extreme learning machine (WRELM), which are modified and improved forms of the original ELM model. First, the chapter provides an overview of the original as well as the three modified ELM models. Then, the chapter presents a case study in which the application of the four heuristic ELM models (one original and three improved versions) is demonstrated in forecasting monthly streamflow using data from two stations, Topluca and Tozkoy, located in Turkey. The case study further compares the performance of the four ELM models with the performance of the standard multiple linear regression (MLR) model in forecasting monthly streamflow. The monthly streamflow (Q: m3/s) at time t is forecasted by comparing several input combinations of Q at the previous time steps (e.g., t − 1, t − 2, ...).

11.2 Overview of extreme learning machine and multiple linear regression

11.2.1 Extreme learning machine model and its extensions

The feedforward artificial neural network with a single hidden layer, called the SLFN, is the most well-known kind of ANN model and possesses the universal approximation capability (Hornik et al., 1989). Training of the SLFN is


the most important step toward the success of the ANN models, and a lot of important work has been done to improve the learning process of the SLFN model. Huang et al. (2006a,b) introduced an innovative algorithm for training the SLFN, called the ELM. Using the standard backpropagation training algorithm, the training process is achieved by updating all the weights and biases between the input and the hidden layer and between the hidden and the output layer. The innovative ELM approach proposed by Huang et al. (2006a,b) instead divides the training process into two stages: the weights and biases between the input and hidden neurons are randomly generated (a_ij and θ_j), and the output weights (β_j) are determined analytically by the Moore–Penrose generalized inverse matrix, resulting in a significant increase in performance and decrease in training time, especially when high volumes of data are involved (Zhang and Luo, 2015; Cheng et al., 2019). For the first stage, the ELM training algorithm is described as follows (Fig. 11.1): for N arbitrary data pairs (x_i, y_i), where x_i = [x_i1, x_i2, ..., x_in]^T ∈ R^n corresponds to the input variables and y_i = [y_i1, y_i2, ..., y_im]^T ∈ R^m corresponds to the output variable, the output function of the SLFN model with L hidden neurons can be expressed as follows (Huang et al., 2006a,b):

\[
f_L(x_i) = \sum_{j=1}^{L} \beta_j \, g\!\left(a_{ij} \cdot x_i + \theta_j\right) = y_i, \qquad i = 1, \ldots, N
\tag{11.1}
\]

In Eq. (11.1), a_ij and θ_j are the weight and bias vectors of the hidden nodes, β_j is the output weight between the hidden and output layer, g is the activation function, and f_L(x_i) is the mathematical expression of the SLFN. According to Cheng et al. (2019), for the second stage, the output weights (β_j) are solved by minimizing the approximation error (i.e., the difference between target and computed values) as follows:

\[
\min_{\beta} \; \left\| H\beta - U \right\|_2^2
\tag{11.2}
\]

FIGURE 11.1 Flowchart of the extreme learning machine models.


According to Huang et al. (2006a,b), H is the hidden layer output matrix of the SLFN model (Xu et al., 2016), where H and U are defined as follows:

\[
H = \begin{bmatrix}
h_1\!\left(a_1 \cdot x_1 + \theta_1\right) & \cdots & h_L\!\left(a_L \cdot x_1 + \theta_L\right) \\
\vdots & \ddots & \vdots \\
h_1\!\left(a_1 \cdot x_N + \theta_1\right) & \cdots & h_L\!\left(a_L \cdot x_N + \theta_L\right)
\end{bmatrix}_{N \times L}
\tag{11.3}
\]

\[
U = \begin{bmatrix} y_1^{T} \\ \vdots \\ y_N^{T} \end{bmatrix}
\tag{11.4}
\]

The above equations can be formulated as follows:

\[
\beta = H^{\dagger} U \quad \text{and} \quad H^{\dagger} = \left(H^{T} H\right)^{-1} H^{T}
\tag{11.5}
\]

where H† is the Moore–Penrose generalized inverse of the matrix H (Yang et al., 2019). Several studies have highlighted that the ELM model suffers from the overfitting problem, as it is mainly based on the principle of "empirical risk minimization" (ERM) (Xu et al., 2016; Peng et al., 2015; Wang et al., 2019a,b). Hence, to avoid the overfitting problem, the model should make use of the "structural risk minimization" (SRM) principle instead of the ERM. Adoption of the SRM principle improves the generalization and robustness and avoids the local minima that otherwise decrease the performance of the models and cause overfitting (Zhang and Luo, 2015; Peng et al., 2015; Xu et al., 2016; Wang et al., 2019a,b; Yang et al., 2019; Cheng et al., 2019). Consequently, an improved version of the ELM model called the RELM model was introduced by adding a regularization term C (Cao et al., 2018a,b; Wang et al., 2019b), and the solution for the output weights (β) is obtained as follows:

\[
\min_{\beta} \; C\,\frac{1}{2}\left\| H\beta - U \right\|_2^2 + \frac{1}{2}\left\| \beta \right\|_2^2
\tag{11.6}
\]

\[
\min_{\beta} \; C\,\frac{1}{2}\left\| e \right\|_2^2 + \frac{1}{2}\left\| \beta \right\|_2^2, \qquad e = H\beta - U
\tag{11.7}
\]

The regularization term C controls the trade-off between the training error and the output weight norm (i.e., the ratio between the error and the output weights), and e is the training error calculated between target and estimated values (Zhang and Luo, 2015; Cao et al., 2018a,b; Deng et al., 2009; Wang et al., 2019b). A further improved version of the RELM model was proposed by Deng et al. (2009) by introducing a weighting factor (W_i) on the model error (e), with


the goal of decreasing the effect of large training errors caused by the presence of outliers (Zhang and Luo, 2015); the resulting model is called the WRELM model. According to Zhang and Luo (2015), the weighting factor (W_i) is applied to the error term as follows:

\[
\left\| e \right\|_2^2 = \left\| W \cdot e \right\|_2^2
\tag{11.8}
\]

where W is the weighting factor calculated as follows (Zhang and Luo, 2015):

\[
W_i = \begin{cases}
1, & \left| e_i / a \right| \le 2.5 \\[4pt]
\dfrac{2.5}{\left| e_i / a \right|}, & 2.5 < \left| e_i / a \right| \le 3 \\[4pt]
10^{-4}, & \text{otherwise}
\end{cases},
\qquad a = \frac{\mathrm{IQR}}{2 \times 0.6745}
\tag{11.9}
\]

for which IQR is the interquartile range, calculated as the difference between the 75th percentile and the 25th percentile (Zhang and Luo, 2015), and a is a robust estimate of the standard deviation of the RELM error. Later, the RELM was further improved to reduce the effect of large errors caused by the presence of outliers, and a new version called the ORELM model was proposed. The ORELM leads to a significant improvement of the ELM model by simultaneously reducing the error e and the norm of the output weights through a constrained convex optimization problem (Zhang and Luo, 2015):

\[
\min_{\beta} \; C \left\| e \right\|_1 + \frac{1}{C} \left\| \beta \right\|_2^2
\tag{11.10}
\]

In this chapter, all the proposed ELM models were developed and applied using the MATLAB toolbox provided by Zhang and Luo (2015), which is freely available at https://fr.mathworks.com/matlabcentral/fileexchange/50877.
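To make the two-stage training idea of Eqs. (11.1)–(11.7) concrete, the following NumPy sketch trains a single-hidden-layer network the ELM way: random input weights and biases, followed by a least-squares solve for the output weights, optionally ridge-regularized in the spirit of the RELM objective (11.6). It is an illustrative re-implementation under the notation above, not the MATLAB toolbox of Zhang and Luo (2015); the outlier-robust l1 step of the ORELM (11.10) is omitted, and all function and variable names are assumptions.

```python
import numpy as np

def elm_train(X, y, n_hidden=50, C=None, seed=0):
    """Stage 1: random hidden weights/biases; stage 2: solve for output weights.

    With C=None the plain Moore-Penrose solution beta = pinv(H) @ y of Eq. (11.5)
    is used; with a finite C a ridge-style regularized solution corresponding to
    the RELM objective (11.6) is used instead.
    """
    rng = np.random.default_rng(seed)
    a = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))   # random input weights
    theta = rng.uniform(-1.0, 1.0, size=n_hidden)             # random hidden biases
    H = np.tanh(X @ a + theta)                                 # hidden layer output matrix
    if C is None:
        beta = np.linalg.pinv(H) @ y
    else:
        beta = np.linalg.solve(H.T @ H + np.eye(n_hidden) / C, H.T @ y)
    return a, theta, beta

def elm_predict(X, a, theta, beta):
    return np.tanh(X @ a + theta) @ beta

# Hypothetical usage with lagged monthly flows as inputs (see Table 11.2):
# a, theta, beta = elm_train(X_train, y_train, n_hidden=30, C=100.0)
# q_forecast = elm_predict(X_valid, a, theta, beta)
```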

11.2.2 Multiple linear regression

The MLR model is a well-known statistical model used for linking a set of predictors (x_i) to one dependent variable (Y) by applying an ensemble of parameters (β_i) (Heddam, 2017):

\[
Y = f(x_i) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \cdots + \beta_i x_i
\tag{11.11}
\]

The parameters β_i are determined using the least squares method.
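For comparison with the ELM variants, the MLR baseline of Eq. (11.11) amounts to an ordinary least-squares fit; a minimal sketch using NumPy is shown below (variable and function names are illustrative assumptions).

```python
import numpy as np

def mlr_fit(X, y):
    """Least-squares estimate of [b0, b1, ..., bp] for Y = b0 + b1*x1 + ... + bp*xp."""
    X1 = np.column_stack([np.ones(len(X)), X])      # prepend intercept column
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef

def mlr_predict(X, coef):
    return np.column_stack([np.ones(len(X)), X]) @ coef
```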


11.3 A case study of forecasting streamflows using extreme learning machine models

11.3.1 Study area

In this chapter, the ML models were implemented utilizing monthly streamflow data from two stations, Tozkoy and Topluca, in the Black Sea Region (BSR) of Turkey. The locations of the stations are illustrated in Fig. 11.2. Both stations have streamflow data covering the period from 1964 to 2007 without any data gaps. The coast of the BSR receives the highest precipitation amounts; the eastern part annually receives about 2200 mm of precipitation. The region has a wet and humid climate, with mean air temperatures of about 22°C in summer and 4°C in winter. The average annual precipitation is 842 mm, and about 19.4% of this amount falls in the summer season (June, July, August). Relative humidity in the study area is very high, with an annual value of 71% (Sensoy et al., 2008).

FIGURE 11.2 Geographical locations of the studied stations (Tozkoy, 2233; Topluca, 2232) in Black Sea Region of Turkey.


Data of both stations were obtained from the State Water Works of Turkey. The eastern BSR receives average precipitation of 198 mm annually, and its river basin area is 24,077 km2 (Kahya and Kalayci, 2004). Basic statistics of the data are given in the next section.

11.4 Applications and results

The basic statistical characteristics, i.e., mean, minimum, maximum, standard deviation, and coefficient of variation, of the streamflows for the whole datasets as well as for the model training and validation datasets at the two stations are summarized in Table 11.1. It can be seen that the streamflow data of Tozkoy Station have higher variation than those of Topluca Station, as revealed by the values of the coefficient of variation. The other important information derived from Table 11.1 is that the training data extremes do not cover the extremes of the validation data at either station. The maximum streamflow values are higher in the validation dataset than in the training dataset. This may cause slight difficulties for the implemented models in extrapolating beyond the training range, as reported by Kisi and Aytek (2013) and Kisi and Parmar (2016). Four ELM models were implemented for monthly streamflow forecasting, and their forecasts were compared with those of the MLR. Four statistical metrics, i.e., the correlation coefficient (R2), NSE, root mean square error (RMSE), and mean absolute error (MAE), were applied for evaluating the accuracy of the models:

\[
R^2 = \left[ \frac{\frac{1}{N}\sum_{i=1}^{N}\left(Q_{iO} - \overline{Q}_{O}\right)\left(Q_{iM} - \overline{Q}_{M}\right)}
{\sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(Q_{iO} - \overline{Q}_{O}\right)^2}\,\sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(Q_{iM} - \overline{Q}_{M}\right)^2}} \right]^2
\tag{11.12}
\]

\[
NSE = 1 - \frac{\sum_{i=1}^{N}\left(Q_{iO} - Q_{iM}\right)^2}{\sum_{i=1}^{N}\left(Q_{iO} - \overline{Q}_{O}\right)^2}
\tag{11.13}
\]

\[
RMSE = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(Q_{iO} - Q_{iM}\right)^2}
\tag{11.14}
\]

\[
MAE = \frac{1}{N}\sum_{i=1}^{N}\left| Q_{iO} - Q_{iM} \right|
\tag{11.15}
\]

where N is the number of data, Q_iO is the observed streamflow, Q_iM is the modeled streamflow, and \(\overline{Q}_{O}\) and \(\overline{Q}_{M}\) are the means of the observed and modeled streamflows, respectively.
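A compact sketch of Eqs. (11.12)–(11.15) in NumPy is given below; the function and variable names are illustrative, not the authors' code.

```python
import numpy as np

def evaluate(q_obs, q_mod):
    """Return R2, NSE, RMSE, and MAE as defined in Eqs. (11.12)-(11.15)."""
    q_obs, q_mod = np.asarray(q_obs, float), np.asarray(q_mod, float)
    err = q_obs - q_mod
    # correlation coefficient between observed and modeled series
    r = np.mean((q_obs - q_obs.mean()) * (q_mod - q_mod.mean())) / (q_obs.std() * q_mod.std())
    return {
        "R2": r ** 2,
        "NSE": 1.0 - np.sum(err ** 2) / np.sum((q_obs - q_obs.mean()) ** 2),
        "RMSE": np.sqrt(np.mean(err ** 2)),
        "MAE": np.mean(np.abs(err)),
    }
```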

TABLE 11.1 The statistical parameters of the streamflow data (Q: m3/s) for the two stations.

                             Topluca station                          Tozkoy station
Statistic                    Whole data   Training   Validation       Whole data   Training   Validation
Mean                         28.964       28.655     30.233           6.627        6.447      7.140
Maximum                      125.0        99.3       125.0            36.9         31.4       36.9
Minimum                      6.8          6.8        8.5              1.0          1.0        1.3
Standard deviation           20.650       19.742     21.668           7.190        6.889      7.677
Coefficient of variation     0.713        0.689      0.717            1.085        1.069      1.075


Various input scenarios involving previous streamflow values were considered for each of the four heuristic ELM models. A summary of these input scenarios, together with the models' labels, is provided in Table 11.2. The optimal input lags were decided based on the results of autocorrelation and partial autocorrelation analyses, as illustrated in Fig. 11.3. It is evident that after a time lag of 12 months, the streamflow data have insignificant partial autocorrelation values.

TABLE 11.2 The input combinations of different models.

ORELM     ELM     RELM     WRELM     MLR     Input combination
ORELM1    ELM1    RELM1    WRELM1    MLR1    Q(t − 1)
ORELM2    ELM2    RELM2    WRELM2    MLR2    Q(t − 1), Q(t − 2)
ORELM3    ELM3    RELM3    WRELM3    MLR3    Q(t − 1), Q(t − 2), Q(t − 5)
ORELM4    ELM4    RELM4    WRELM4    MLR4    Q(t − 1), Q(t − 2), Q(t − 5), Q(t − 6)
ORELM5    ELM5    RELM5    WRELM5    MLR5    Q(t − 1), Q(t − 2), Q(t − 5), Q(t − 6), Q(t − 11)
ORELM6    ELM6    RELM6    WRELM6    MLR6    Q(t − 1), Q(t − 2), Q(t − 5), Q(t − 6), Q(t − 11), Q(t − 12)

ELM, extreme learning machine; MLR, multiple linear regression; ORELM, outlier-robust extreme learning machine; RELM, regularized extreme learning machine; WRELM, weighted regularized extreme learning machine.

FIGURE 11.3 Autocorrelation (ACF) and partial autocorrelation function (PACF) for both stations.
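The lag screening behind Table 11.2 and Fig. 11.3 can be reproduced, at least approximately, with the ACF/PACF utilities in statsmodels; the sketch below flags lags whose partial autocorrelation exceeds the usual approximate 95% confidence band. The file name, the 12-lag window, and the simple 1.96/sqrt(N) band are illustrative assumptions rather than the chapter's exact procedure.

```python
import numpy as np
from statsmodels.tsa.stattools import pacf

def significant_lags(q, max_lag=12, band_z=1.96):
    """Return lags (1..max_lag) whose PACF lies outside the ~95% band +/- 1.96/sqrt(N)."""
    q = np.asarray(q, dtype=float)
    values = pacf(q, nlags=max_lag)               # values[0] is lag 0 and is always 1
    band = band_z / np.sqrt(len(q))
    return [lag for lag in range(1, max_lag + 1) if abs(values[lag]) > band]

# Hypothetical usage on a monthly series:
# q_monthly = np.loadtxt("monthly_flow.txt")
# print(significant_lags(q_monthly))   # e.g., lags such as 1, 2, 5, 6, 11, 12
```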


Values of the performance evaluation metrics, i.e., R2, NSE, RMSE, and MAE, in the training and validation stages for the implemented models in forecasting streamflow at Topluca Station are shown in Table 11.3. It is seen from Table 11.3 that the sixth input scenario provides the best forecasting accuracy for all the employed models in both the training and validation stages. Furthermore, the ELM models are more successful in simulating and forecasting the streamflow data than the MLR model. The ELM6 model showed the best performance in the validation stage, with the lowest RMSE (10.835 m3/s) and the highest NSE (0.748), closely followed by the RELM6 model with an RMSE of 10.894 m3/s and an NSE of 0.745. The ORELM6 and WRELM6 models obtained the third and fourth ranks among the best-performing models, just after the ELM6 and RELM6 models placed at the first two ranks. On the other hand, the MLR6 model performed poorest among all the applied models. It is evident from Table 11.3 that including a time lag of 11 months as input considerably improves the performance of all the models in both the training and validation stages. For example, in the validation stage, inclusion of an 11-month time lag decreases the RMSE value from 15.036 to 12.354 m3/s for the MLR model, from 13.790 to 11.439 m3/s for the ELM model, from 14.335 to 11.439 m3/s for the ORELM model, from 13.723 to 11.325 m3/s for the RELM model, and from 13.760 to 11.030 m3/s for the WRELM model. Likewise, consideration of an 11-month time lag increases the NSE value from 0.515 to 0.673 for the MLR model, from 0.592 to 0.719 for the ELM model, from 0.559 to 0.719 for the ORELM model, from 0.596 to 0.725 for the RELM model, and from 0.594 to 0.739 for the WRELM model. This is also confirmed by the PACF plot depicted in Fig. 11.3, in which Q(t − 11) has the highest PACF value at both stations. It is also worth mentioning that using only Q(t − 1) and Q(t − 2) is not sufficient for modeling the monthly streamflow in the study area. Based on the accuracy of the streamflow forecasts, the implemented models are ranked from high to low accuracy at Topluca Station as ELM6 > RELM6 > ORELM6 > WRELM6 > MLR6. The ELM6 model showed a marginal improvement in accuracy over the MLR6 model, with respect to R2, NSE, RMSE, and MAE, of 2.1%, 4.5%, 5.8%, and 3.2%, respectively. Values of the statistical metrics for evaluating the performance of the implemented models in forecasting streamflow at Tozkoy Station are given in Table 11.4. Streamflow at this station is also most accurately forecasted by all the methods with the sixth input scenario. At Tozkoy Station, the RELM6 model showed the best performance in the validation stage, with the lowest RMSE (3.089 m3/s) and MAE (1.851 m3/s) and the highest NSE (0.837) and R2 (0.916). The ORELM6 and ELM6 models occupy the third and fourth ranks among the best-performing forecasting models, after the RELM6 and WRELM6 models at the first two ranks. In contrast, the MLR6 model yielded streamflow forecasts with the lowest accuracy. This finding is similar

TABLE 11.3 Performances of different models in forecasting streamflow at Topluca Station.

             Model training                                  Model validation
Models       R2      NSE     RMSE (m3/s)  MAE (m3/s)         R2      NSE     RMSE (m3/s)  MAE (m3/s)
MLR1         0.667   0.445   14.690       11.140             0.606   0.363   17.237       11.982
MLR2         0.756   0.571   12.911       9.484              0.715   0.505   15.197       9.783
MLR3         0.769   0.592   12.596       9.434              0.723   0.518   14.997       10.185
MLR4         0.784   0.614   12.243       9.042              0.721   0.515   15.036       9.915
MLR5         0.853   0.727   10.298       7.227              0.821   0.673   12.354       7.868
MLR6         0.879   0.773   9.389        6.729              0.847   0.716   11.505       7.316
ELM1         0.692   0.479   14.225       10.819             0.609   0.364   17.224       12.146
ELM2         0.804   0.646   11.726       8.451              0.738   0.541   14.622       9.410
ELM3         0.854   0.729   10.260       7.298              0.766   0.580   13.986       9.416
ELM4         0.844   0.712   10.580       7.621              0.773   0.592   13.790       9.514
ELM5         0.883   0.780   9.256        6.642              0.849   0.719   11.439       7.711
ELM6         0.896   0.803   8.751        6.175              0.865   0.748   10.835       7.084
ORELM1       0.676   0.410   15.148       10.323             0.604   0.318   17.827       11.769
ORELM2       0.806   0.644   11.758       8.187              0.745   0.547   14.529       9.370
ORELM3       0.853   0.719   10.458       7.101              0.776   0.588   13.852       9.095
ORELM4       0.834   0.685   11.058       7.639              0.754   0.559   14.335       9.285
ORELM5       0.886   0.782   9.201        6.180              0.855   0.719   11.439       7.449
ORELM6       0.895   0.797   8.873        5.968              0.858   0.735   11.121       7.365
RELM1        0.679   0.461   14.476       10.947             0.608   0.367   17.176       12.170
RELM2        0.794   0.631   11.978       8.656              0.740   0.543   14.597       9.729
RELM3        0.856   0.733   10.187       7.258              0.756   0.567   14.213       9.744
RELM4        0.836   0.700   10.805       7.842              0.773   0.596   13.723       9.255
RELM5        0.896   0.803   8.746        6.162              0.852   0.725   11.325       7.508
RELM6        0.890   0.792   8.998        6.356              0.864   0.745   10.894       6.921
WRELM1       0.672   0.446   14.671       10.778             0.608   0.365   17.199       11.846
WRELM2       0.797   0.633   11.939       8.429              0.745   0.550   14.483       9.334
WRELM3       0.848   0.717   10.483       7.443              0.764   0.576   14.063       9.385
WRELM4       0.841   0.706   10.686       7.462              0.780   0.594   13.760       9.255
WRELM5       0.882   0.775   9.359        6.471              0.863   0.739   11.030       7.318
WRELM6       0.897   0.804   8.739        6.100              0.857   0.731   11.201       7.291

MAE, mean absolute error; MLR1, multiple linear regression model corresponding to combination 1 (see Table 11.2 for all combinations); NSE, Nash–Sutcliffe efficiency; R2, correlation coefficient; RMSE, root mean square error.


TABLE 11.4 Performances of different models in forecasting streamflow at Tozkoy Station.

             Training                                       Validation
Models       R2      NSE     RMSE (m3/s)  MAE (m3/s)         R2      NSE     RMSE (m3/s)  MAE (m3/s)
MLR1         0.671   0.450   5.100        3.611              0.611   0.368   6.080        4.012
MLR2         0.705   0.497   4.877        3.597              0.641   0.404   5.904        4.119
MLR3         0.705   0.498   4.877        3.598              0.641   0.404   5.903        4.124
MLR4         0.730   0.533   4.699        3.433              0.653   0.416   5.845        4.107
MLR5         0.878   0.770   3.298        2.108              0.832   0.692   4.243        2.531
MLR6         0.917   0.841   2.747        1.645              0.890   0.793   3.484        2.049
ELM1         0.708   0.501   4.861        3.418              0.618   0.372   6.061        4.043
ELM2         0.896   0.803   3.054        2.059              0.801   0.630   4.653        2.799
ELM3         0.939   0.882   2.368        1.620              0.846   0.707   4.140        2.648
ELM4         0.941   0.885   2.331        1.570              0.854   0.722   4.031        2.639
ELM5         0.940   0.884   2.340        1.440              0.899   0.806   3.372        2.060
ELM6         0.946   0.896   2.222        1.360              0.911   0.829   3.159        1.988
ORELM1       0.674   0.362   5.498        3.103              0.615   0.291   6.439        3.525
ORELM2       0.899   0.802   3.061        1.976              0.812   0.650   4.525        2.682
ORELM3       0.939   0.881   2.375        1.433              0.852   0.722   4.033        2.358
ORELM4       0.930   0.861   2.564        1.508              0.851   0.713   4.102        2.303
ORELM5       0.950   0.901   2.165        1.078              0.903   0.813   3.311        1.872
ORELM6       0.943   0.889   2.290        1.190              0.911   0.829   3.164        1.746
RELM1        0.700   0.491   4.911        3.480              0.626   0.389   5.979        4.049
RELM2        0.901   0.811   2.990        2.040              0.805   0.639   4.596        2.803
RELM3        0.938   0.880   2.383        1.595              0.852   0.720   4.048        2.525
RELM4        0.935   0.875   2.433        1.635              0.853   0.721   4.043        2.659
RELM5        0.940   0.884   2.340        1.451              0.901   0.811   3.325        2.000
RELM6        0.950   0.903   2.141        1.264              0.916   0.837   3.089        1.851
WRELM1       0.676   0.429   5.201        3.235              0.615   0.353   6.154        3.677
WRELM2       0.897   0.803   3.058        2.016              0.808   0.645   4.557        2.686
WRELM3       0.926   0.857   2.605        1.572              0.845   0.705   4.152        2.297
WRELM4       0.935   0.874   2.441        1.542              0.847   0.709   4.127        2.459
WRELM5       0.940   0.884   2.346        1.274              0.906   0.818   3.260        1.719
WRELM6       0.950   0.902   2.149        1.157              0.915   0.836   3.098        1.808

MAE, mean absolute error; MLR1, multiple linear regression model corresponding to combination 1 (see Table 11.2 for all combinations); NSE, Nash–Sutcliffe efficiency; R2, correlation coefficient; RMSE, root mean square error.

for both stations of this study. Similar to Topluca Station, inclusion of the streamflow at an 11-month time lag, Q(t − 11), as an input variable for Tozkoy Station considerably improves the models' accuracies in forecasting streamflow. For example, comparison of the fourth and fifth input scenarios reveals a decrease in RMSE from 5.845 to 4.243 m3/s for the MLR model, from 4.031 to 3.372 m3/s for the ELM model, from 4.102 to 3.311 m3/s for the ORELM model, from 4.043 to 3.325 m3/s for the RELM model, and from 4.127 to 3.260 m3/s for the WRELM model. Similarly, consideration of Q(t − 11) improves the NSE from 0.416 to 0.692 for the MLR model, from 0.722 to 0.806 for the ELM model, from 0.713 to 0.813 for the ORELM model, from 0.721 to 0.811 for the RELM model, and from 0.709 to 0.818 for the WRELM model. At Tozkoy Station, the implemented models may be ranked in order of their accuracies from the best to the worst as RELM6 > WRELM6 > ELM6 > ORELM6 > MLR6. The RELM6 model improved the accuracy of the streamflow forecasts over the MLR6 model, with respect to R2, NSE, RMSE, and MAE, by 2.4%, 4.5%, 9.2%, and 14.8%, respectively. Streamflow forecasts made by all the implemented models are compared graphically with the observed streamflow in Fig. 11.4 for Topluca Station in the validation stage. It is evident from the figure that the ELM model, ELM6, resulted in the highest R2 value of 0.7484. Observed and model-predicted streamflow values of Tozkoy Station are plotted together in Fig. 11.5. A comparison of the streamflow predictions made by the four ELM models with the predictions of the MLR model for both stations revealed that the MLR model produced inferior forecasts relative to the ELM models. Among the machine learning models, the RELM6 model has the least scattered predictions of streamflow, with the highest R2 value (0.8387). It is further apparent from the streamflow time plots that none of the models could adequately predict a few monthly peak values of the streamflows. This might be due to the fact that the models were trained with a smaller number of data


FIGURE 11.4 Comparison between observed and predicted values of streamflow and scatterplot of the best models in the validation period for the Topluca Station: (A) ORELM6, (B) ELM6, (C) RELM6, (D) WRELM6, and (E) MLR6.

points, including fewer peak streamflow values, and thus the data-driven methods could not learn the process that forms the peak streamflow points in the time series. The prediction results of the employed models are further compared using a Taylor diagram (Fig. 11.6) and box plots (Fig. 11.7). These figures confirm the findings obtained from the time plots and scatter plots shown in the previous figures. Overall, the results of the study indicate that


FIGURE 11.5 Comparison between observed and predicted values of streamflow and scatterplot of the best models in the validation period for the Tozkoy Station: (A) ORELM6, (B) ELM6, (C) RELM6, (D) WRELM6, and (E) MLR6.

the MLR model is the worst in forecasting streamflows at both stations, whereas the ELM6 and RELM6 models yielded the streamflow forecasts closest to the measured values at Topluca and Tozkoy stations, respectively. Based on this study's outcome, it is difficult to comment on the relative performance of, and differences among, the ELM models. However, it is clear that the ELM-predicted streamflow values are closer to the observed streamflow values for Tozkoy Station.


FIGURE 11.6 Taylor diagram displaying a statistical comparison of the models with measured values of monthly streamflow Q (m3/s).

FIGURE 11.7 Box plots of measured and calculated values of monthly streamflow (Q) in the validation phase of Topluca and Tozkoy stations.

11.5 Conclusions

In this chapter, a brief overview of the new ELM models is provided, and their capabilities and robustness are examined in forecasting monthly streamflow. The investigation showed promising results, as the numerical evaluation using statistical metrics revealed good performance of the models. First, a correlation analysis was carried out using the ACF and PACF to determine the optimal input lags, and a total of six previous time lags were selected as input variables. Second, the proposed ELM models were applied and compared across the six different scenarios of input combinations. The results revealed a clear variability in the models' performances; the models provided relatively similar results, with slight differences in their relative accuracy, especially as the number of input time lags increased. The obtained results demonstrated the potential of the ELM models for monthly streamflow forecasting. The outcomes of this study have a direct bearing on the ongoing research conducted during the last few years, and the obtained statistical indices show that the best heuristic models may be applied successfully for forecasting streamflow at both the Topluca and Tozkoy stations of Turkey.


References Abbasi, M., Farokhnia, A., Bahreinimotlagh, M., Roozbahani, R., 2020. A hybrid of random forest and deep auto-encoder with support vector regression methods for accuracy improvement and uncertainty reduction of long-term streamflow prediction. J. Hydrol. 125717. https://doi.org/ 10.1016/j.jhydrol.2020.125717. Adnan, R.M., Liang, Z., Heddam, S., Zounemat-Kermani, M., Kisi, O., Li, B., 2020. Least square support vector machine and multivariate adaptive regression splines for streamflow prediction in mountainous basin using hydro-meteorological data as inputs. J. Hydrol. 586, 124371. https://doi.org/10.1016/j.jhydrol.2019.124371. Adnan, R.M., Liang, Z., Trajkovic, S., Zounemat-Kermani, M., Li, B., Kisi, O., 2019. Daily streamflow prediction using optimally pruned extreme learning machine. J. Hydrol. 577, 123981. https://doi.org/10.1016/j.jhydrol.2019.123981. Adarsh, S., Reddy, M.J., 2016. Multiscale characterization of streamflow and suspended sediment concentration data using Hilbert-Huang transform and time dependent intrinsic correlation analysis. Model. Earth Syst. Environ. 2 (4), 1e17. https://doi.org/10.1007/s40808-016-0254-z. Al-Sudani, Z.A., Salih, S.Q., Yaseen, Z.M., 2019. Development of multivariate adaptive regression spline integrated with differential evolution model for streamflow simulation. J. Hydrol. 573, 1e12. https://doi.org/10.1016/j.jhydrol.2019.03.004. Cao, R., Cao, J., Mei, J.P., Yin, C., Huang, X., 2018a. Radar emitter identification with bispectrum and hierarchical extreme learning machine. Multimed. Tool. Appl. 78, 28953e28970. https:// doi.org/10.1007/s11042-018-6134-y. Cao, J., Wang, T., Shang, L., Lai, X., Vong, C.M., Chen, B., 2018b. An intelligent propagation distance estimation algorithm based on fundamental frequency energy distribution for periodic vibration localization. J. Franklin Inst. 355 (4), 1539e1558. https://doi.org/10.1016/ j.jfranklin.2017.02.011. Ciria, T.P., Labat, D., Chiogna, G., 2019. Detection and interpretation of recent and historical streamflow alterations caused by river damming and hydropower production in the Adige and Inn River Basins using continuous, discrete and multiresolution wavelet analysis. J. Hydrol. 578, 124021. https://doi.org/10.1016/j.jhydrol.2019.124021. Choubin, B., Solaimani, K., Rezanezhad, F., Roshan, M.H., Malekian, A., Shamshirband, S., 2019. Streamflow regionalization using a similarity approach in ungauged basins: application of the geo-environmental signatures in the Karkheh river basin, Iran. Catena 182, 104128. https:// doi.org/10.1016/j.catena.2019.104128. Cheng, Y., Zhao, D., Wang, Y., Pei, G., 2019. Multi-label learning with kernel extreme learning machine autoencoder. Knowl. Base Syst. 178, 1e10. https://doi.org/10.1016/ j.knosys.2019.04.002. Deng, W., Zheng, Q., Chen, L., 2009. Regularized extreme learning machine. In: IEEE Symposium on Computational Intelligence and Data Mining, pp. 389e395. https://doi.org/ 10.1109/CIDM.2009.4938676. Fang, W., Huang, S., Ren, K., Huang, Q., Huang, G., Cheng, G., Li, K., 2019. Examining the applicability of different sampling techniques in the development of decomposition-based streamflow forecasting models. J. Hydrol. 568, 534e550. https://doi.org/10.1016/ j.jhydrol.2018.11.020. Fathian, F., Fard, A.F., Ouarda, T.B., Dinpashoh, Y., Nadoushani, S.M., 2019. Modeling streamflow time series using nonlinear SETAR-GARCH models. J. Hydrol. 573, 82e97. https:// doi.org/10.1016/j.jhydrol.2019.03.072.


Freire, P.K.D.M.M., Santos, C.A.G., da Silva, G.B.L., 2019. Analysis of the use of discrete wavelet transforms coupled with ANN for short-term streamflow forecasting. Appl. Soft Comput. 80, 494e505. https://doi.org/10.1016/j.asoc.2019.04.024. Fouchal, A., Souag-Gamane, D., 2019. Long-term monthly streamflow forecasting in humid and semiarid regions. Acta Geophys. 1e18. https://doi.org/10.1007/s11600-019-00312-3. Gibbs, M., McInerney, D., Humphrey, G., Thyer, M., Maier, H., Dandy, G., Kavetski, D., 2018. State updating and calibration period selection to improve dynamic monthly streamflow forecasts for an environmental flow management application. Hydrol. Earth Syst. Sci. 22, 871e887. https://doi.org/10.5194/hess-22-871-2018. Hornik, K., Stinchcombe, M., White, H., 1989. Multilayer feedforward networks are universal approximators. Neural Netw. 2, 359e366. https://doi.org/10.1016/0893-6080 (89)90020-8. Huang, G.B., Chen, L., Siew, C.K., 2006a. Universal approximation using incremental constructive feedforward networks with random hidden nodes. IEEE Trans. Neural Netw. 17 (4), 879e892. https://doi.org/10.1109/TNN.2006.875977. Huang, G.B., Zhu, Q.Y., Siew, C.K., 2006b. Extreme learning machine: theory and applications. Neurocomputing 70 (1e3), 489e501. https://doi.org/10.1016/j.neucom.2005.12.126. Huang, N.E., Shen, Z., Long, S.R., Wu, M.C., Shih, H.H., Zheng, Q., Yen, N.C., Tung, C.C., Liu, H.H., 1998. The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proc. R. Soc. Lon. Series A Math. Phys. Eng. Sci. 454 (1971), 903e995. https://doi.org/10.1098/rspa.1998.0193. Hu, H., Zhang, J., Li, T., 2020. A Comparative Study of VMD-Based Hybrid Forecasting Model for Nonstationary Daily Streamflow Time Series. Complexity. https://doi.org/10.1155/2020/ 4064851. Heddam, S., 2017. Generalized regression neural network based approach as a new tool for predicting total dissolved gas (TDG) downstream of spillways of dams: a case study of Columbia River basin dams, USA. Environ. Proc. 4 (1), 235e253. https://doi.org/10.1007/ s40710-016-0196-5. Kahya, E., Kalayci, S., 2004. Trend analysis of streamflow in Turkey. J. Hydrol. 289 (1e4), 128e144. https://doi.org/10.1016/j.jhydrol.2003.11.006. Kisi, O., Aytek, A., 2013. Explicit neural network in suspended sediment load estimation. Neural Netw. World 6/13, 587e607. http://www.nnw.cz/doi/2013/NNW.2013.23.035.pdf. Kisi, O., Parmar, K.S., 2016. Application of least square support vector machine and multivariate adaptive regression spline models in long term prediction of river water pollution. J. Hydrol. 534, 104e112. https://doi.org/10.1016/j.jhydrol.2015.12.014. Luo, X., Yuan, X., Zhu, S., Xu, Z., Meng, L., Peng, J., 2019. A hybrid support vector regression framework for streamflow forecast. J. Hydrol. 568, 184e193. https://doi.org/10.1016/ j.jhydrol.2018.10.064. Labat, D., 2005. Recent advances in wavelet analyses: part 1. A review of concepts. J. Hydrol. 314 (1e4), 275e288. https://doi.org/10.1016/j.jhydrol.2005.04.003. Labat, D., Ronchail, J., Guyot, J.L., 2005. Recent advances in wavelet analyses: Part 2-Amazon, Parana, Orinoco and Congo discharges time scale variability. J. Hydrol. 314 (1e4), 289e311. https://doi.org/10.1016/j.jhydrol.2005.04.004. Meng, E., Huang, S., Huang, Q., Fang, W., Wu, L., Wang, L., 2019. A robust method for nonstationary streamflow prediction based on improved EMD-SVM model. J. Hydrol. 568, 462e478. https://doi.org/10.1016/j.jhydrol.2018.11.015. 
Mazrooei, A., Sankarasubramanian, A., 2019. Improving monthly streamflow forecasts through assimilation of observed streamflow for rainfall-dominated basins across the CONUS. J. Hydrol. 575, 704e715. https://doi.org/10.1016/j.jhydrol.2019.05.071.

302 Advances in Streamflow Forecasting Mehdizadeh, S., Fathian, F., Adamowski, J.F., 2019. Hybrid artificial intelligence-time series models for monthly streamflow modeling. Appl. Soft Comput. 80, 873e887. https://doi.org/ 10.1016/j.asoc.2019.03.046. Mihailovic, D.T., Nikolic-Ðoric, E., Arsenic, I., Malinovic-Milicevic, S., Singh, V.P., Stosic, T., Stosic, B., 2019. Analysis of daily streamflow complexity by Kolmogorov measures and Lyapunov exponent. Phys. Stat. Mech. Appl. 525, 290e303. https://doi.org/10.1016/ j.physa.2019.03.041. Nanda, T., Sahoo, B., Chatterjee, C., 2019. Enhancing real-time streamflow forecasts with waveletneural network based error-updating schemes and ECMWF meteorological predictions in Variable Infiltration Capacity model. J. Hydrol. 575, 890e910. https://doi.org/10.1016/ j.jhydrol.2019.05.051. Peng, Y., Wang, S., Long, X., Lu, B.L., 2015. Discriminative graph regularized extreme learning machine and its application to face recognition. Neurocomputing 149, 340e353. https:// doi.org/10.1016/j.neucom.2013.12.065. Poul, A.K., Shourian, M., Ebrahimi, H., 2019. A comparative study of MLR, KNN, ANN and ANFIS models with wavelet transform in monthly stream flow prediction. Water Resour. Manag. 1e17. https://doi.org/10.1007/s11269-019-02273-0. Rezaie-Balf, M., Kim, S., Fallah, H., Alaghmand, S., 2019. Daily river flow forecasting using ensemble empirical mode decomposition based heuristic regression models: application on the perennial rivers in Iran and South Korea. J. Hydrol. 572, 470e485. https://doi.org/10.1016/ j.jhydrol.2019.03.046. Sensoy, S., Demircan, M., Ulupınar, U., Balta, I., 2008. Turkey Climate. DMI Web Site (in Turkish). http://www.dmi.gov.tr/iklim/iklim.aspx. Tikhamarine, Y., Souag-Gamane, D., Kisi, O., 2019. A new intelligent method for monthly streamflow prediction: hybrid wavelet support vector regression based on grey wolf optimizer (WSVR-GWO). Arab. J. Geosci. 12 (17), 540. https://doi.org/10.1007/s12517-019-4697-1. Wang, L., Li, X., Ma, C., Bai, Y., 2019a. Improving the prediction accuracy of monthly streamflow using a data-driven model based on a double-processing strategy. J. Hydrol. 573, 733e745. https://doi.org/10.1016/j.jhydrol.2019.03.101. Wang, Y., Yang, L., Yuan, C., 2019b. A robust outlier control framework for classification designed with family of homotopy loss function. Neural Netw. 112, 41e53. https://doi.org/10.1016/ j.neunet.2019.01.013. Xu, Z., Yao, M., Wu, Z., Dai, W., 2016. Incremental regularized extreme learning machine and it‫׳‬s enhancement. Neurocomputing 174, 134e142. https://doi.org/10.1016/j.neucom.2015.01.097. Yang, L., Yang, B., Jing, S., 2019. A minimax probability extreme machine framework and its application in pattern recognition. Eng. Appl. Artif. Intell. 81, 260e269. https://doi.org/ 10.1016/j.engappai.2019.02.012. Yaseen, Z.M., El-shafie, A., Jaafar, O., Afan, H.A., Sayl, K.N., 2015. Artificial intelligence based models for stream-flow forecasting: 2000-2015. J. Hydrol. 530, 829e844. https://doi.org/ 10.1016/j.jhydrol.2015.10.038. Yaseen, Z.M., Sulaiman, S.O., Deo, R.C., Chau, K.W., 2019a. An enhanced extreme learning machine model for river flow forecasting: State-of-the-art, practical applications in water resource engineering area and future research direction. J. Hydrol. 569, 387e408. https:// doi.org/10.1016/j.jhydrol.2018.11.069. Yaseen, Z.M., Mohtar, W.H.M.W., Ameen, A.M.S., Ebtehaj, I., Razali, S.F.M., Bonakdari, H., Salih, S.Q., Al-Ansari, N., Shahid, S., 2019b. 
Implementation of univariate paradigm for streamflow simulation using hybrid data-driven model: case study in tropical region. IEEE Access 7, 74471e74481. https://doi.org/10.1109/ACCESS.2019.2920916.


Zhang, K., Luo, M., 2015. Outlier-robust extreme learning machine for regression problems. Neurocomputing 151, 1519e1527. https://doi.org/10.1016/j.neucom.2014.09.022. Zhou, Y., Guo, S., Chang, F.J., 2019. Explore an evolutionary recurrent ANFIS for modelling multi-step-ahead flood forecasts. J. Hydrol. 570, 343e355. https://doi.org/10.1016/ j.jhydrol.2018.12.040. Zamoum, S., Souag-Gamane, D., 2019. Monthly streamflow estimation in ungauged catchments of northern Algeria using regionalization of conceptual model parameters. Arab. J. Geosci. 12 (11), 342. https://doi.org/10.1007/s12517-019-4487-9.

Chapter 12

Hybrid artificial intelligence models for predicting daily runoff

Anurag Malik1,2, Anil Kumar2, Yazid Tikhamarine3,4, Doudja Souag-Gamane5, Özgür Kişi6

1Punjab Agricultural University, Regional Research Station, Bathinda, Punjab, India; 2Department of Soil and Water Conservation Engineering, College of Technology, G.B. Pant University of Agriculture and Technology, Pantnagar, Uttarakhand, India; 3Southern Public Works Laboratory (LTPS), Tamanrasset Antenna, Tamanrasset, Algeria; 4Department of Science and Technology, University of Tamanrasset, Sersouf, Tamanrasset, Algeria; 5Leghyd Laboratory, University of Sciences and Technology Houari Boumediene, Bab Ezzouar, Algiers, Algeria; 6Department of Civil Engineering, School of Technology, Ilia State University, Tbilisi, Georgia

12.1 Introduction

Runoff forecasting at short-term (daily and hourly) and long-term (weekly, monthly, and annual) time scales is profoundly important for sustainable planning and management of water resources, for example, prevention and control of floods and droughts, reservoir operations, hydropower generation, irrigation management, sediment transport, and water resources allocation (Jiang et al., 2018; Yaseen et al., 2015). The runoff generation process is principally influenced by climatic factors, human activities, and watershed/catchment characteristics (Chang et al., 2017). In the literature, researchers have used different process-based or physical hydrological models to predict runoff (Jaiswal et al., 2020; LV et al., 2020; Pradhan et al., 2020; Raihan et al., 2020); however, runoff patterns depend on the spatiotemporal variability and regional heterogeneity of hydrological variables. Moreover, process-based models require several kinds of input data for streamflow or runoff forecasting, which are sometimes not available, especially in developing countries. Thus, the complexities arising from these hydrological variations result in uncertainties in runoff prediction with physical hydrological models. Such uncertainty has been addressed to a large extent by the introduction of data-driven empirical and statistical models in hydrology. One of the traditional statistical models widely used in streamflow


forecasting is the autoregressive integrated moving average (ARIMA) and its many subsequent expansions (Viccione et al., 2020). Artificial intelligence (AI) method, including artificial neural network (ANN), support vector machines (SVM), adaptive neuro-fuzzy inference system (ANFIS), and genetic programming (GP), among others, is one of the most advanced techniques that has been employed in hydrological prediction and modeling studies worldwide (Ali and Shahbaz, 2020; Ateeq-ur-Rauf et al., 2018; Danandeh Mehr, 2018; Granata et al., 2016; Hadi and Tombul, 2018; Hussain and Khan, 2020). In recent times, researchers have started utilizing AI models to forecast streamflow variables (Diop et al., 2018; Shoaib et al., 2016; Yaseen et al, 2017, 2019). Wang et al. (2009) examined the performance of four AI techniques, i.e., ANFIS, SVM, ANN, and GP, by comparing with the performance of autoregressive moving average (ARMA) modeling using long-term (1953e2004 and 1951e2004) measurements of monthly streamflow in two rivers (Lancangjiang and Wujiang rivers) of China, and they concluded that the SVM model performed with better accuracy than the other techniques during both training and testing stages. Over the last decade, data-driven models have been applied worldwide in a large number of studies dealing with modeling and forecasting of various hydrological variables including runoff (Adnan et al., 2018; Ahani et al., 2018; Ghorbani et al., 2018; Kis¸i, 2015; Nourani et al., 2019; Qi et al., 2019; RezaieBalf et al., 2019; Yaseen et al., 2016c). Yaseen et al. (2016a) applied radial basis function neural network (RBFNN) and feedforward backpropagation neural network models to forecast daily streamflow at Johor River, Malaysia, and they reported relatively better performance of the RBFNN model in the forecasting of streamflow. Yaseen et al. (2016b) examined the potential of extreme learning machine (ELM) against the generalized regression neural network (GRNN) and support vector regression (SVR) to forecast the monthly streamflow in Tigris River, Iraq. The comparative results indicated the superior performance of ELM over the GRNN and SVR techniques. Hadi and Tombul (2018) utilized ANN, ANFIS, SVM, and autoregressive (AR) models for forecasting daily streamflow in three basins located in Seyhan River Basin in Turkey, and results indicated the better performance of ANN and ANFIS over the basins. Yaseen et al. (2018) predicted monthly streamflow in Tigris River, Iraq, using wavelet-extreme learning machine (WA-ELM) and ELM models, and they found that the coupled WA-ELM model enhanced the streamflow predictability. Nourani et al. (2019) investigated the performance of the hybrid wavelet-M5 model tree (W-M5) model against ANN and M5 models in simulating the rainfall-runoff process on a daily and monthly basis in Aji Chai Catchment (Iran) and Murrumbidgee Catchment (Australia). The results showed the superior performance of hybrid W-M5 models over the ANN and M5 models. Recently, metaheuristic algorithms have received more attention in the areas of optimization and modeling (Dai et al., 2018; Dehghani et al., 2019;


Faris et al., 2016; Mirjalili et al., 2020; Tikhamarine et al., 2019b). Metaheuristic algorithms can be classified in several ways. Some are population-based algorithms, such as the genetic algorithm (GA) and GP, which operate on a set of strings, and particle swarm optimization (PSO), which uses particles called agents (Kennedy and Eberhart, 1995). Likewise, some metaheuristic algorithms are classified as nature-inspired algorithms, such as the grey wolf optimizer (GWO), the whale optimization algorithm (WOA), and the multi-verse optimizer (MVO) (Mirjalili et al., 2014, 2016; Mirjalili and Lewis, 2016). The WOA has been used in many optimization and modeling problems and has demonstrated superiority over other algorithms in finding global best solutions. Al-Zoubi et al. (2018) used WOA for training SVM and identifying its best parameters; the WOA was compared with PSO and GA, and the results indicated that it is superior to both in terms of accuracy and precision. The WOA has also been successfully used in feature selection problems (Mafarja and Mirjalili, 2017), where it was assessed on 18 standard benchmark datasets and the experimental results evidenced its efficiency in improving classification ability compared with other algorithms. Aljarah et al. (2018) used WOA for tuning the weights (parameters) of ANNs, and the results showed that the algorithm is capable of solving various optimization problems and performs better than existing algorithms. This chapter provides a theoretical description of the multilayer perceptron (MLP) neural network and SVR models along with two metaheuristic optimizers, WOA and GWO. Then, new hybrid AI models are proposed based on SVR and MLP neural networks in conjunction with these two recently developed metaheuristic optimizers. In addition, this chapter includes a case study in which the applicability of the MLP neural network and SVR models along with the hybrid AI models (hybrid MLP and SVR) is demonstrated by forecasting runoff in a hilly watershed of the upper Ramganga River catchment (RRC), Uttarakhand, India. Finally, the capability of the suggested hybrid AI models in forecasting runoff is evaluated against the multiple linear regression (MLR) model based on three performance evaluation indicators and through visual inspection.

12.2 Theoretical background of MLP and SVR models

12.2.1 Support vector regression model

Vapnik (1995) developed SVM, which is based on the theory of statistical learning and the structural risk minimization principle. The SVR is a kind of SVM technique based on regression functions (Smola and Schölkopf, 2004). By applying the idea of SVM to the regression problem, the SVR can satisfactorily map the nonlinear hidden patterns in the original data. Thus, the SVR model is used to resolve prediction and nonlinear regression problems in a time series. The SVR regression function, f(x), is defined as (Vapnik, 1995)

$f(x) = w \cdot \varphi(x) + b$   (12.1)

where w is the weight vector, b is the bias, and φ is the transfer function. Thus, the regression problem of obtaining a proper SVR function f(x) can be expressed as (Tikhamarine et al., 2020)

Minimize
$\frac{1}{2}\|w\|^2 + C \sum_{i=1}^{N} \left(\xi_i + \xi_i^*\right)$   (12.2)

subject to
$\begin{cases} y_i - w \cdot \varphi(x_i) - b \le \varepsilon + \xi_i \\ w \cdot \varphi(x_i) + b - y_i \le \varepsilon + \xi_i^* \\ \xi_i,\ \xi_i^* \ge 0, \quad i = 1, 2, \ldots, N \end{cases}$   (12.3)

where C > 0 is a penalty constant, ε is the error tolerance range of the function, and ξi and ξi* are slack variables corresponding to the boundary values of ε, which determine the trade-off between the training error and the model flatness. After introducing Lagrange multipliers, the nonlinear regression function can be expressed as

$f(x) = \sum_{i=1}^{n} \left(\alpha_i - \alpha_i^*\right) K(x, x_i) + b$   (12.4)

where αi and αi* ≥ 0 are the Lagrange multipliers and K(x, xi) is the kernel function. The kernel function is used to change the dimensionality of the input space and produce more accurate regression results (Azamathulla and Wu, 2011). There are several kernel functions, for example, linear, polynomial, sigmoid, and radial basis function (RBF), which can be expressed as (Elshorbagy et al., 2010)

$K(x, x_i) = \begin{cases} x_i^{T} x + c & \text{(linear)} \\ \left(1 + x_i^{T} x\right)^{d} & \text{(polynomial)} \\ \exp\left(-\|x_i - x\|^2 / 2\sigma^2\right) & \text{(RBF)} \\ \tanh\left(\sigma x_i^{T} x + \theta\right) & \text{(sigmoid)} \end{cases}$   (12.5)

where x and xi are input space vectors, and c, d, σ, and θ are the kernel function parameters. All the mentioned parameters ought to be optimized for the best performance.
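
For readers who wish to experiment with the formulation above, a minimal ε-SVR with an RBF kernel can be fitted, for example, with scikit-learn; the arrays and parameter values below are placeholders rather than the settings used later in this chapter.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic stand-ins for lagged rainfall/runoff predictors (X) and runoff (y).
rng = np.random.default_rng(0)
X = rng.random((200, 4))          # e.g., [R_t, R_t-1, Q_t-1, Q_t-2]
y = X @ np.array([2.0, 0.2, 0.7, 0.1]) + 0.05 * rng.standard_normal(200)

# epsilon-SVR with RBF kernel: C (penalty), epsilon (tolerance), gamma (RBF width).
model = SVR(kernel="rbf", C=10.0, epsilon=0.01, gamma=0.5)
model.fit(X, y)
y_hat = model.predict(X)
print("training RMSE:", np.sqrt(np.mean((y - y_hat) ** 2)))
```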


12.2.2 Multilayer perceptron neural network model

The MLP neural network is a widespread kind of ANN with three layers: an input layer, one intermediate (hidden) layer, and an output layer (McClelland and Rumelhart, 1989). Each neuron receives an array of inputs and provides a single output; the outputs of the input layer are used as inputs to the hidden layer, the outputs of the hidden layer are used as inputs to the next hidden layer (if any), and finally to the output layer. A processing function called the activation function is used in each neuron of the hidden and output layers. The explicit expression for computing the output value Q of an MLP neural network model with three layers can be given by the following equation (Tikhamarine et al., 2019a):

$Q = F_o\left[\sum_{j=1}^{m} W_{kj} \cdot F_h\left(\sum_{i=1}^{n} W_{ji} X_i + b_{jo}\right) + b_{ko}\right]$   (12.6)

where Q is the daily runoff predicted by the MLP neural network model, Xi are the input variables, Fo is the activation (transfer) function of the output layer neurons, Fh is the activation function of the hidden layer neurons, Wkj is a weight connecting the hidden layer with the output layer, Wji is a weight in the hidden layer, bko is the bias for the output layer, bjo is the bias in the hidden layer, n is the number of inputs, m is the number of hidden neurons, and i, j, and k index the input, hidden, and output layers, respectively.
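
A minimal numerical sketch of Eq. (12.6) is given below; the layer sizes, sigmoid hidden activation, and random weights are purely illustrative.

```python
import numpy as np

def mlp_forward(X, W_hidden, b_hidden, W_out, b_out):
    """Three-layer MLP output per Eq. (12.6): sigmoid hidden layer, linear output."""
    hidden = 1.0 / (1.0 + np.exp(-(X @ W_hidden + b_hidden)))  # F_h(sum W_ji X_i + b_jo)
    return hidden @ W_out + b_out                              # F_o taken as identity here

n_inputs, n_hidden = 4, 6
rng = np.random.default_rng(1)
W_hidden = rng.standard_normal((n_inputs, n_hidden))
b_hidden = rng.standard_normal(n_hidden)
W_out = rng.standard_normal((n_hidden, 1))
b_out = rng.standard_normal(1)

X = rng.random((5, n_inputs))     # five sample days of lagged rainfall/runoff inputs
print(mlp_forward(X, W_hidden, b_hidden, W_out, b_out).ravel())
```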

12.2.3 Grey wolf optimizer algorithm

The GWO is a swarm optimization algorithm based on the social hunting behavior of grey wolves, developed by Mirjalili et al. (2014). The GWO algorithm was inspired by examining the social hunting of grey wolves in nature and their search for the optimal path to hunt their prey. The social hierarchy of wolves is modeled in the GWO algorithm by dividing the pack into four sets: (i) alpha (α), (ii) beta (β), (iii) delta (δ), and (iv) omega (ω). Of these four sets, alpha (α) is considered the best solution, followed by β and δ as the second and third best solutions, respectively, with the remaining candidate solutions as ω. The mechanism of wolf hunting includes the steps of tracking, encircling, and attacking the prey. The behavior of encircling the prey is formulated as follows (Mirjalili et al., 2014):

$\vec{D} = \left|\vec{C} \cdot \vec{X}_P(t) - \vec{X}(t)\right|$   (12.7)

$\vec{X}(t+1) = \vec{X}_P(t) - \vec{A} \cdot \vec{D}$   (12.8)

where D is the distance between the prey and the grey wolf, X is the position vector of a grey wolf, t indicates the current iteration, XP is the position vector of the prey, and A and C are coefficient vectors. The coefficient vectors are computed as (Mirjalili et al., 2014)

$\vec{A} = 2\vec{a} \cdot \vec{r}_1 - \vec{a}$   (12.9)

$\vec{C} = 2 \cdot \vec{r}_2$   (12.10)

where a is a vector whose elements linearly decrease from 2 to 0 over the iterations, and r1 and r2 are random vectors in the range [0, 1]. The ω wolves update their locations around α, β, and δ in the optimization process. Accordingly, the ω wolves reposition themselves with respect to α, β, and δ as defined by the following equations (Mirjalili et al., 2014):

$\vec{D}_\alpha = \left|\vec{C}_1 \cdot \vec{X}_\alpha(t) - \vec{X}\right|$   (12.11)

$\vec{D}_\beta = \left|\vec{C}_2 \cdot \vec{X}_\beta(t) - \vec{X}\right|$   (12.12)

$\vec{D}_\delta = \left|\vec{C}_3 \cdot \vec{X}_\delta(t) - \vec{X}\right|$   (12.13)

$\vec{X}_1 = \vec{X}_\alpha(t) - \vec{A}_1 \cdot \vec{D}_\alpha$   (12.14)

$\vec{X}_2 = \vec{X}_\beta(t) - \vec{A}_2 \cdot \vec{D}_\beta$   (12.15)

$\vec{X}_3 = \vec{X}_\delta(t) - \vec{A}_3 \cdot \vec{D}_\delta$   (12.16)

These equations indicate how grey wolves update their positions. It is assumed that the α, β, and δ wolves are nearest to the prey and lead the ω wolves toward the prey location, so these first three positions are used to estimate the prey position. As the leading wolves hunt for new locations, the ω wolves update their positions to move closer to the prey:

$\vec{X}(t+1) = \dfrac{\vec{X}_1 + \vec{X}_2 + \vec{X}_3}{3}$   (12.17)

where X(t+1) is the position at the next iteration.
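
The update rules of Eqs. (12.7)-(12.17) can be condensed into a short routine such as the sketch below; the population size, iteration count, bounds, and test function are illustrative choices, not values used in the case study.

```python
import numpy as np

def gwo(fitness, dim, n_wolves=20, n_iter=100, lb=-5.0, ub=5.0, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.uniform(lb, ub, (n_wolves, dim))               # wolf positions
    for t in range(n_iter):
        scores = np.apply_along_axis(fitness, 1, X)
        order = np.argsort(scores)
        alpha = X[order[0]].copy()                          # best solution so far
        beta, delta = X[order[1]].copy(), X[order[2]].copy()
        a = 2.0 * (1 - t / n_iter)                          # linearly decreases 2 -> 0
        for i in range(n_wolves):
            X_new = np.zeros(dim)
            for leader in (alpha, beta, delta):
                r1, r2 = rng.random(dim), rng.random(dim)
                A, C = 2 * a * r1 - a, 2 * r2               # Eqs. (12.9)-(12.10)
                D = np.abs(C * leader - X[i])               # Eqs. (12.11)-(12.13)
                X_new += leader - A * D                     # Eqs. (12.14)-(12.16)
            X[i] = np.clip(X_new / 3.0, lb, ub)             # Eq. (12.17)
    scores = np.apply_along_axis(fitness, 1, X)
    return X[np.argmin(scores)], scores.min()

best_x, best_f = gwo(lambda x: np.sum(x ** 2), dim=3)       # toy sphere function
print(best_x, best_f)
```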

12.2.4 Whale optimization algorithm

The WOA is a nature-inspired metaheuristic optimization algorithm that mimics the behavior of humpback whales in the ocean. The WOA was introduced by Mirjalili and Lewis (2016) and has been hybridized with several machine learning methods such as SVM and ANN (Mafarja and Mirjalili, 2018). Whales swim around the prey spirally and build bubbles to trap the prey (the bubble-net feeding method). Like other metaheuristic algorithms, the WOA has two stages, exploitation and exploration. The algorithm begins to approximate the global optimum by initializing the whale population at random positions. During exploitation (encircling the prey), the whales move toward the best solution obtained so far, as shown in the following equations (Mirjalili and Lewis, 2016):

$\vec{D} = \left|\vec{C} \cdot \vec{X}^{*}(t) - \vec{X}(t)\right|$   (12.18)

$\vec{X}(t+1) = \vec{X}^{*}(t) - \vec{A} \cdot \vec{D}$   (12.19)

where t denotes the current iteration, X is the current whale position vector, and X* is the position vector of the best solution. A and C are coefficient vectors, defined as (Mirjalili and Lewis, 2016)

$\vec{A} = 2\vec{a} \cdot \vec{r} - \vec{a}$   (12.20)

$\vec{C} = 2 \cdot \vec{r}$   (12.21)

where r is a random vector in [0, 1] and a decreases linearly from 2 to 0 over the course of iterations (in both exploration and exploitation phases), as defined below:

$\vec{a} = 2\left(1 - t/T\right)$   (12.22)

The spiral principle is then utilized to describe the spiral path pursued by the whales (Mirjalili and Lewis, 2016):

$\vec{X}(t+1) = \vec{D}' \cdot e^{bl} \cdot \cos(2\pi l) + \vec{X}^{*}(t)$   (12.23)

$\vec{D}' = \left|\vec{X}^{*}(t) - \vec{X}(t)\right|$   (12.24)

where D′ is the distance of the ith whale to the prey (the best solution obtained so far), b is a constant defining the shape of the logarithmic spiral, l is a random number in the interval [−1, 1], and the multiplication is performed element by element. A probability of 50% is used to choose between the shrinking encircling mechanism and the spiral-shaped path:

$\vec{X}(t+1) = \vec{X}^{*}(t) - \vec{A} \cdot \vec{D}$, as in Eq. (12.19), if $P < 0.5$ (shrinking encircling)   (12.25)

$\vec{X}(t+1) = \vec{D}' \cdot e^{bl} \cdot \cos(2\pi l) + \vec{X}^{*}(t)$, as in Eq. (12.23), if $P \ge 0.5$ (spiral-shaped path)   (12.26)

where P is a random number in [0, 1]. In the exploration phase, the position of a whale is updated with respect to an arbitrarily chosen search agent rather than the best solution; as such, the whale moves away from the reference whale when |A| > 1. The exploratory search of the WOA is performed as follows (Mirjalili and Lewis, 2016):

$\vec{D} = \left|\vec{C} \cdot \vec{X}_{rand} - \vec{X}\right|$   (12.27)

$\vec{X}(t+1) = \vec{X}_{rand} - \vec{A} \cdot \vec{D}$   (12.28)

where Xrand is a random search agent chosen from the current population, A is a controlling parameter that switches between exploration and exploitation, and the parameter P controls the switching between the spiral and circular movements (Mirjalili and Lewis, 2016). At each iteration, the search agents' positions are updated with respect to the best solution when |A| < 1 or with respect to a randomly selected search agent when |A| ≥ 1. This procedure is repeated until a stopping criterion is satisfied, and eventually the algorithm returns the best solution.
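
A corresponding sketch of the WOA updates in Eqs. (12.18)-(12.28) is given below, again with illustrative settings; the 50% switching probability and the linearly decreasing a follow the description above, while the problem and population sizes are placeholders.

```python
import numpy as np

def woa(fitness, dim, n_whales=20, n_iter=100, lb=-5.0, ub=5.0, b=1.0, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.uniform(lb, ub, (n_whales, dim))
    scores = np.apply_along_axis(fitness, 1, X)
    best = X[np.argmin(scores)].copy()
    for t in range(n_iter):
        a = 2.0 * (1 - t / n_iter)                           # Eq. (12.22)
        for i in range(n_whales):
            r = rng.random(dim)
            A, C = 2 * a * r - a, 2 * rng.random(dim)        # Eqs. (12.20)-(12.21)
            if rng.random() < 0.5:                           # encircling or random search
                if np.all(np.abs(A) < 1):
                    D = np.abs(C * best - X[i])              # Eq. (12.18)
                    X[i] = best - A * D                      # Eq. (12.19)
                else:
                    X_rand = X[rng.integers(n_whales)]
                    D = np.abs(C * X_rand - X[i])            # Eq. (12.27)
                    X[i] = X_rand - A * D                    # Eq. (12.28)
            else:                                            # spiral-shaped path
                l = rng.uniform(-1, 1)
                D_prime = np.abs(best - X[i])                # Eq. (12.24)
                X[i] = D_prime * np.exp(b * l) * np.cos(2 * np.pi * l) + best  # Eq. (12.23)
            X[i] = np.clip(X[i], lb, ub)
        scores = np.apply_along_axis(fitness, 1, X)
        if scores.min() < fitness(best):
            best = X[np.argmin(scores)].copy()
    return best, fitness(best)

print(woa(lambda x: np.sum(x ** 2), dim=3))                  # toy sphere function
```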

12.2.5 Hybrid MLP neural network model

Conventional MLP neural network training starts from an arbitrary weight assignment, which can trap the network in local optima and affects the training time and convergence speed. Weights are randomly chosen in the range [0, 1], and the weights of each neuron are adjusted after the transfer function for the next iteration (Tikhamarine et al., 2019a). In the hybrid MLP neural network models, the weights and biases are instead optimized using the new metaheuristic algorithms: the GWO and WOA algorithms optimize the weights and biases (the parameters of the MLP neural network model) for each neuron of the network. All connection weights and biases must be included in the training of the system so that the root mean square error (RMSE) can be minimized. To optimize the MLP neural network model, GWO and WOA use a fitness function that seeks the minimum value of RMSE; this fitness function guides the choice of weights and biases during training of the MLP neural network model. Training is stopped after the termination criteria are met or the maximum number of iterations is reached. The hybrid model coupling the MLP neural network with GWO generates the MLP-GWO model, while the combination of the WOA algorithm with the MLP neural network generates the MLP-WOA model. The architecture of the hybrid MLP neural network with the GWO and WOA metaheuristic algorithms, demonstrating the step-by-step procedure of model implementation, is illustrated in Fig. 12.1A.
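
To make the coupling concrete, the sketch below shows one possible way to expose the MLP weights and biases to a metaheuristic as a flat vector and to use RMSE as the fitness; the architecture and helper names are hypothetical, not the authors' implementation.

```python
import numpy as np

def decode(params, n_inputs, n_hidden):
    """Unpack a flat optimizer vector into MLP weights and biases."""
    i = 0
    W_h = params[i:i + n_inputs * n_hidden].reshape(n_inputs, n_hidden)
    i += n_inputs * n_hidden
    b_h = params[i:i + n_hidden]; i += n_hidden
    W_o = params[i:i + n_hidden].reshape(n_hidden, 1); i += n_hidden
    b_o = params[i:i + 1]
    return W_h, b_h, W_o, b_o

def mlp_rmse_fitness(params, X, y, n_inputs, n_hidden):
    """Fitness used by the metaheuristic: RMSE of the MLP defined by `params`."""
    W_h, b_h, W_o, b_o = decode(params, n_inputs, n_hidden)
    hidden = 1.0 / (1.0 + np.exp(-(X @ W_h + b_h)))
    y_hat = (hidden @ W_o + b_o).ravel()
    return np.sqrt(np.mean((y - y_hat) ** 2))

# Dimension of the search space the optimizer explores:
n_inputs, n_hidden = 4, 6
dim = n_inputs * n_hidden + n_hidden + n_hidden + 1
# e.g., with the gwo() sketch above and training arrays X_train, y_train:
# best_params, best_rmse = gwo(
#     lambda p: mlp_rmse_fitness(p, X_train, y_train, n_inputs, n_hidden), dim=dim)
```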


FIGURE 12.1 Flowchart of the proposed hybrid (A) MLP, and (B) SVR models.

12.2.6 Hybrid SVR model

In general, the accuracy of SVR modeling depends on the selection of adequate SVR parameters and of the kernel function. In the case study presented in this chapter, the RBF was chosen as the kernel function; it has only one parameter (γ). Thus, only three parameters in total need to be adapted: the RBF parameter (γ), the ε-insensitive parameter, and the penalty (C). The best choices of the SVR parameters are not known a priori, and obtaining optimal values for them is computationally hard. Therefore, the hybrid AI algorithms are implemented to ease the difficult task of obtaining optimal values of these three parameters. In the hybrid SVR models, the SVR parameters are optimized using the two new metaheuristic algorithms, GWO and WOA. The hybrid strategy coupling SVR with GWO generates the SVR-GWO model, while the conjunction of the WOA algorithm with SVR produces the SVR-WOA model. A flowchart illustrating the step-by-step procedure for implementing the hybrid SVR model coupled with the GWO and WOA metaheuristic algorithms is shown in Fig. 12.1B.
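
Analogously, a hypothetical fitness function for tuning the three SVR parameters (C, ε, γ) with GWO or WOA could look like the sketch below; a validation split (or cross-validation) is assumed here, and the helper names are illustrative.

```python
import numpy as np
from sklearn.svm import SVR

def svr_rmse_fitness(params, X_train, y_train, X_val, y_val):
    """Fitness for the metaheuristic: validation RMSE of an RBF SVR whose
    parameters (C, epsilon, gamma) come from the search vector."""
    C, epsilon, gamma = params
    model = SVR(kernel="rbf", C=max(C, 1e-3),
                epsilon=max(epsilon, 1e-4), gamma=max(gamma, 1e-4))
    model.fit(X_train, y_train)
    y_hat = model.predict(X_val)
    return np.sqrt(np.mean((y_val - y_hat) ** 2))

# The optimizer then searches a 3-dimensional space, e.g.
# best, rmse = woa(lambda p: svr_rmse_fitness(p, X_tr, y_tr, X_va, y_va),
#                  dim=3, lb=0.001, ub=100.0)
```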


12.3 Application of hybrid MLP and SVR models in runoff prediction: a case study

12.3.1 Study area and data acquisition

The study area selected for application of the hybrid AI models in runoff prediction was the Naula watershed, situated in the Ranikhet Forest subdivision of the upper RRC, Uttarakhand, India (Fig. 12.2). The study area lies between 29°44′ N and 30°6′20″ N latitude and 79°06′15″ E and 79°31′15″ E longitude, with elevations varying from 709 to 3079 m above mean sea level. The watershed has a rectangular shape and comprises an area of 1023 km2 of hilly terrain. The average annual precipitation in the area is 1014 mm, a major portion of which is received during the monsoon season (June to September). Daily rainfall and runoff data for the monsoon seasons from June 2000 to September 2004 were collected from the Soil Conservation Divisional Forest Office, Ranikhet, Uttarakhand, India. The available rainfall and runoff data were partitioned into two sets: (i) a training dataset covering June 2000 to September 2003 (4 years) and (ii) a testing dataset covering June 2004 to September 2004 (the last year). The statistical parameters of rainfall and runoff for the training and testing periods are presented in Table 12.1. The rainfall varied from 0 to 98.20 mm with a mean value of 3.88 mm, standard deviation (SD) of 9.74 mm, skewness of 4.51, and kurtosis of 28.36 for the training period. For the testing period, rainfall varied from 0 to

FIGURE 12.2 Location map of Naula watershed, Uttarakhand, India.

TABLE 12.1 Basic statistical properties of rainfall and runoff datasets at Naula watershed, Uttarakhand, India.

Variable        Dataset    Minimum   Maximum    Mean     Standard deviation   Skewness   Kurtosis
Rainfall (mm)   Training   0         98.20      3.88     9.74                 4.51       28.36
Rainfall (mm)   Testing    0         117.20     5.02     13.52                5.50       39.89
Runoff (m3/s)   Training   7.36      1484.64    231.37   180.75               2.04       8.24
Runoff (m3/s)   Testing    1.73      2160.40    249.83   307.30               3.09       13.28


117.20 mm with a mean value of 5.02 mm, SD of 13.52 mm, skewness of 5.50, and kurtosis of 39.89. The runoff varied from 7.36 to 1484.64 m3/s with a mean value of 231.37 m3/s, SD of 180.75 m3/s, skewness of 2.04, and kurtosis of 8.24 for the training period. For the testing period, runoff varied from 1.73 to 2160.40 m3/s with a mean value of 249.83 m3/s, SD of 307.30 m3/s, skewness of 3.09, and kurtosis of 13.28. Fig. 12.3 presents the total available rainfall and runoff data for the study area graphically. In this work, the SVR models were implemented using the LIBSVM 3.23 program (Chang and Lin, 2011), applied in its ε-SVR form with the RBF as the kernel function; the RBF kernel was adopted because it is widely used in the literature (Kişi and Cimen, 2011; Sihag et al., 2018). For the MLP neural network, the activation functions most commonly used in the literature were adopted (Aljarah et al., 2018; Faris et al., 2016; Mirjalili, 2015).
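
Assuming the daily series were available as a simple file, the partitioning and lagged predictors described above could be prepared as in the sketch below; the file and column names are placeholders, not the actual data source.

```python
import pandas as pd

# Hypothetical daily series with columns "rainfall" and "runoff", indexed by date.
data = pd.read_csv("naula_daily.csv", parse_dates=["date"], index_col="date")

# Lagged predictors used later in the chapter: R_t, R_t-1, Q_t-1, Q_t-2.
data["R_t"] = data["rainfall"]
data["R_t1"] = data["rainfall"].shift(1)
data["Q_t1"] = data["runoff"].shift(1)
data["Q_t2"] = data["runoff"].shift(2)
data = data.dropna()

train = data.loc["2000-06-01":"2003-09-30"]   # monsoon seasons 2000-2003
test = data.loc["2004-06-01":"2004-09-30"]    # monsoon season 2004
print(train.describe()[["rainfall", "runoff"]])
```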

12.3.2 Gamma test for evaluating the sensitivity of input variables

In modeling a nonlinear hydrological process, identifying an appropriate input-output structure is very tedious. To address this problem, the gamma test (GT) has been employed widely in diverse fields (Kakaei Lafdani et al., 2013; Malik et al., 2019; Piri et al., 2009; Rashidi et al., 2016). The GT is a mathematically proven, smooth nonlinear tool for estimating the variance of the noise on the output variable, which provides an estimate of the minimum mean squared error that a smooth model can achieve for the corresponding output. The GT was first proposed by Stefánsson et al. (1997) and has since been further developed and applied in numerous studies through several applications

FIGURE 12.3 Time series plot of daily rainfall and runoff datasets at study basin.

Hybrid artificial intelligence models Chapter | 12

317

over different parts of the globe (Ashrafzadeh et al., 2020; Choubin and Malekian, 2017; Malik et al., 2017a, 2018, 2019; Noori et al., 2011; Rashidi et al., 2016). The relationship between inputs and outputs is written as (Elshorbagy et al., 2010)

$y = f(x_1, \ldots, x_m) + \varepsilon$   (12.29)

where f is a smooth function, ε is the noise, and the variance of the noise, Var(ε), is assumed to be bounded. The GT is based on L[i, k], the indices of the kth (1 ≤ k ≤ p) nearest neighbors x_L[i,k] of each vector x_i (1 ≤ i ≤ N). The GT is derived from the delta (δ) and gamma (γ) functions as follows (Stefánsson et al., 1997):

$\delta_N(k) = \frac{1}{N} \sum_{i=1}^{N} \left|x_{L[i,k]} - x_i\right|^2, \quad 1 \le k \le p$   (12.30)

$\gamma_N(k) = \frac{1}{2N} \sum_{i=1}^{N} \left|y_{L[i,k]} - y_i\right|^2, \quad 1 \le k \le p$   (12.31)

where p is a preselected value and y_L[i,k] is the output value corresponding to the kth nearest neighbor of x_i in Eq. (12.30). A least squares regression line is then constructed through the p points [δ_N(k), γ_N(k)] to compute gamma (Γ) (Piri et al., 2009):

$\gamma = A\delta + \Gamma$   (12.32)

where A is the gradient of the regression line and Γ is the intercept on the vertical axis (δ = 0). The gradient (A) indicates the complexity of the system (a steeper gradient indicates greater complexity), the standard error (SE) shows the reliability of Γ (a small SE means a more reliable Γ), and Vratio is a scale-invariant noise estimate obtained by dividing Γ by the variance of the output variable (Malik et al., 2020):

$V_{ratio} = \frac{\Gamma}{\sigma^2(y)}$   (12.33)

where σ2(y) is the output variance and Γ is the gamma statistic. A Vratio close to zero indicates a high degree of predictability of the output variable. The GT was first used to assess the effectiveness of each input variable within the full combination of candidate inputs. In this sense, the GT is a kind of sensitivity analysis that evaluates the relative effect on the output of excluding one input variable from the analysis. First, one of the input variables was excluded from the initial combination of input parameters, and the GT was performed on the remaining variables. Subsequently, the excluded variable was returned to the combination of input variables and another variable was excluded. This procedure was repeated to exclude every variable once from the combination of input parameters, and the gamma (Γ) value was computed each time. In this iterative procedure, elimination of an effective variable increases the Γ value, whereas elimination of a less important variable decreases the Γ value. Finally, the combination of input parameters that yields smaller values of Γ, SE, and Vratio indicates the optimal combination of input data to be used for further analysis.
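
A minimal sketch of the gamma test statistics in Eqs. (12.30)-(12.33), using k nearest neighbors and a least squares line whose intercept estimates Γ, is given below; the value of p and the synthetic data are illustrative.

```python
import numpy as np
from scipy.spatial import cKDTree

def gamma_test(X, y, p=10):
    """Return (Gamma, gradient A, V_ratio) per Eqs. (12.30)-(12.33)."""
    tree = cKDTree(X)
    # k = p + 1 because the closest neighbor of each point is the point itself.
    dist, idx = tree.query(X, k=p + 1)
    delta = np.mean(dist[:, 1:] ** 2, axis=0)                          # delta_N(k)
    gamma = np.mean((y[idx[:, 1:]] - y[:, None]) ** 2, axis=0) / 2.0   # gamma_N(k)
    A, G = np.polyfit(delta, gamma, 1)       # least squares line: gamma = A*delta + G
    return G, A, G / np.var(y)

rng = np.random.default_rng(0)
X = rng.random((500, 4))
y = 2 * X[:, 0] + 0.5 * X[:, 2] + 0.1 * rng.standard_normal(500)
print(gamma_test(X, y))
```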

12.3.3 Multiple linear regression

MLR is a generalization of the simple regression equation and one of the classical problems in statistical analysis (Malik and Kumar, 2015; Tabari et al., 2011). The MLR finds an appropriate relationship between a dependent variable and one or more independent variables. It generates this relationship by fitting a linear equation of the following form (Malik et al., 2017b):

$Y = a_0 + a_1 X_1 + a_2 X_2 + a_3 X_3 + a_4 X_4 + \cdots + a_n X_n$   (12.34)

where Y is the dependent variable (Qt), X1 to Xn are the independent variables, and a0 to an are the coefficients of the linear regression.
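
Eq. (12.34) amounts to an ordinary least squares fit, which can be obtained, for example, as in the sketch below with placeholder data.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((300, 4))                      # e.g., [R_t, R_t-1, Q_t-1, Q_t-2]
y = 40 + X @ np.array([2.0, 0.2, 0.7, 0.1]) + rng.standard_normal(300)

# Ordinary least squares: prepend a column of ones for the intercept a0.
A = np.column_stack([np.ones(len(X)), X])
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
print("a0..a4:", np.round(coeffs, 3))
y_hat = A @ coeffs                            # fitted values
```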

12.3.4 Performance evaluation indicators

In this study, the performance of the developed hybrid SVR-WOA, SVR-GWO, MLP-GWO, MLP-WOA, and MLR models was evaluated using both statistical indicators, i.e., RMSE, the Pearson correlation coefficient (PCC), and the Willmott index (WI), and visual tools, i.e., line and scatter plots and the Taylor diagram (Taylor, 2001). The RMSE (Malik et al., 2017b), PCC (Malik et al., 2019), and WI (Willmott, 1981) are mathematically expressed as

$RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(Q_{obs,i} - Q_{pre,i}\right)^2}, \quad 0 < RMSE < \infty$   (12.35)

$PCC = \frac{\sum_{i=1}^{N} \left(Q_{obs,i} - \overline{Q}_{obs}\right)\left(Q_{pre,i} - \overline{Q}_{pre}\right)}{\sqrt{\sum_{i=1}^{N} \left(Q_{obs,i} - \overline{Q}_{obs}\right)^2 \sum_{i=1}^{N} \left(Q_{pre,i} - \overline{Q}_{pre}\right)^2}}, \quad -1 < PCC < 1$   (12.36)

$WI = 1 - \left[\frac{\sum_{i=1}^{N} \left(Q_{pre,i} - Q_{obs,i}\right)^2}{\sum_{i=1}^{N} \left(\left|Q_{pre,i} - \overline{Q}_{obs}\right| + \left|Q_{obs,i} - \overline{Q}_{obs}\right|\right)^2}\right], \quad 0 < WI \le 1$   (12.37)

where Q_obs,i and Q_pre,i are the observed and predicted daily runoff values for the ith observation, N is the total number of observations in a dataset, Q̄_obs and Q̄_pre are the averages of the observed and predicted runoff values, respectively, |Q_pre,i − Q̄_obs| is the absolute difference between the predicted runoff and the mean observed runoff, and |Q_obs,i − Q̄_obs| is the absolute difference between the observed runoff and its mean. The model with the minimum value of RMSE and higher values of PCC and WI was judged as the best for prediction of daily runoff in the study area.
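
The three indicators of Eqs. (12.35)-(12.37) translate directly into short functions, as in the sketch below with placeholder observed and predicted values.

```python
import numpy as np

def rmse(obs, pre):
    return np.sqrt(np.mean((obs - pre) ** 2))                    # Eq. (12.35)

def pcc(obs, pre):
    return np.corrcoef(obs, pre)[0, 1]                           # Eq. (12.36)

def willmott_index(obs, pre):
    num = np.sum((pre - obs) ** 2)
    den = np.sum((np.abs(pre - obs.mean()) + np.abs(obs - obs.mean())) ** 2)
    return 1.0 - num / den                                       # Eq. (12.37)

obs = np.array([231.4, 180.8, 249.8, 307.3, 150.0])              # placeholder runoff
pre = np.array([220.0, 190.5, 240.1, 280.7, 160.2])
print(rmse(obs, pre), pcc(obs, pre), willmott_index(obs, pre))
```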

12.4 Results and discussion

12.4.1 Identification of appropriate input variables using gamma test

In this study, the GT was applied to seven different combinations of seven input parameters (i.e., Rt, Rt-1, Rt-2, Rt-3, Qt-1, Qt-2, and Qt-3) to determine the impact of individual parameters on the output (i.e., runoff, Qt) in the Naula watershed (Table 12.2). The first combination, called the initial set, comprised all seven input parameters. The second combination comprised six input parameters (all parameters - Rt), meaning that Rt (i.e., rainfall at time t) was omitted from the initial set. Likewise, the third combination comprised six input parameters (all parameters - Rt-1), meaning that Rt-1 was omitted and Rt was included back in the initial set, and thus, the process was

TABLE 12.2 Preprocessing analysis to identify optimal variables in the study area (gamma test statistics).

Input parameters                                Г           SE         Vratio   Mask
Rt, Rt-1, Rt-2, Rt-3, Qt-1, Qt-2, Qt-3          15,671.53   2,532.84   0.35     1111111
All - Rt                                        16,011.66   2,993.25   0.36     0111111
All - Rt-1                                      16,095.61   2,998.25   0.36     1011111
All - Rt-2                                      14,122.94   2,038.70   0.31     1101111
All - Rt-3                                      15,774.08   2,659.31   0.35     1110111
All - Qt-1                                      22,772.27   3,843.71   0.51     1111011
All - Qt-2                                      18,427.36   3,324.14   0.41     1111101
All - Qt-3                                      18,182.54   1,724.83   0.40     1111110

Qt-1, Qt-2, Qt-3: runoff of the previous 1, 2, and 3 days; Rt, Rt-1, Rt-2, Rt-3: rainfall of the current day and previous 1, 2, and 3 days; SE, standard error; Г, gamma.


repeated for the rest of the combinations given in Table 12.2 (Kakaei Lafdani et al., 2013; Malik et al., 2019; Rashidi et al., 2016). It is seen that the parameters Rt, Rt-1, Qt-1, and Qt-2 have a significant effect on the output (Qt). These effective parameters, identified on the basis of the maximum values of the gamma (Г), standard error (SE), and Vratio statistics, were retained for further use in model development and runoff prediction in the study area. Afterward, five different input combinations were developed using the four effective input parameters (Table 12.3). The GT was again applied to these five combinations, and the optimal one was selected based on the minimum scores of Г, SE, and Vratio (Malik et al., 2019; Noori et al., 2011). The results of the GT for the five combinations of effective input parameters are summarized in Table 12.4, which reveals that the combination of Rt, Rt-1, Qt-1, and Qt-2 (Model-5) had the lowest values of Г = 17,514.14,

TABLE 12.3 Different input combinations for hybrid artificial intelligence models in the study area.

                  SVR-WOA/SVR-GWO/MLP-GWO/MLP-WOA/MLR model
Input variable    1     2     3     4     5
Rt                O     O     O     O     O
Rt-1                    O     O     O     O
Qt-1                          O           O
Qt-2                                O     O

O indicates the input variables included in each combination.

TABLE 12.4 Results of gamma test (GT) on different input combinations in the study area.

Input combination            Г           SE          Vratio   Mask
Rt                           37,785.58   23,345.43   0.84     1000
Rt, Rt-1                     39,995.49   4,700.07    0.89     1100
Rt, Rt-1, Qt-1               22,107.16   2,684.42    0.49     1110
Rt, Rt-1, Qt-2               27,139.02   2,343.97    0.60     1101
Rt, Rt-1, Qt-1, Qt-2         17,514.14   2,088.61    0.39     1111

SE, standard error; Г, gamma.


SE = 2,088.61, and Vratio = 0.39, with a Mask of 1111 (indicating incorporation of all four input parameters). Thus, the optimal combination, i.e., Model-5, was used for daily runoff prediction in the Naula watershed.

12.4.2 Predicting daily runoff using hybrid AI models

The selected hybrid AI models, i.e., SVR-WOA-5, SVR-GWO-5, MLP-GWO-5, MLP-WOA-5, and MLR-5, were trained using data for the period from June 2000 to September 2003 and were tested using data for the period from June 2004 to September 2004 for daily runoff prediction in the study area. The performance of all five models was assessed using statistical and visual performance evaluation indicators. Values of the RMSE, PCC, and WI indicators during the training and testing periods are summarized in Table 12.5. It is seen that the SVR-WOA-5, SVR-GWO-5, MLP-GWO-5, and MLP-WOA-5 models yielded RMSE of 94.498, 109.949, 104.414, and 104.339 m3/s, respectively; PCC of 0.860, 0.801, 0.816, and 0.816, respectively; and WI of 0.911, 0.870, 0.890, and 0.890, respectively, for the training period. On the other hand, for the testing period, the SVR-WOA-5, SVR-GWO-5, MLP-GWO-5, MLP-WOA-5, and MLR-5 models yielded RMSE of 223.046, 228.557, 231.877, 246.718, and 254.146 m3/s, respectively; PCC of 0.733, 0.702, 0.661, 0.609, and 0.593, respectively; and WI of 0.755, 0.744, 0.759, 0.736, and 0.733, respectively. Thus, it is clearly revealed that the SVR-WOA-5 model performed best during the testing period, with lower values of RMSE and higher values of PCC and WI than the SVR-GWO-5, MLP-GWO-5, MLP-WOA-5, and MLR-5 models. The regression equation of the MLR-5 model with its intercept and regression coefficients can be written as

$Q_t = 41.47 + 2.06R_t + 0.19R_{t-1} + 0.71Q_{t-1} + 0.08Q_{t-2}$   (12.38)

TABLE 12.5 Values of performance evaluation indicators for hybrid artificial intelligence models in the study area.

              Training                              Testing
Model         RMSE (m3/s)   PCC     WI              RMSE (m3/s)   PCC     WI
SVR-WOA-5     94.498        0.860   0.911           223.046       0.733   0.755
SVR-GWO-5     109.949       0.801   0.870           228.557       0.702   0.744
MLP-GWO-5     104.414       0.816   0.890           231.877       0.661   0.759
MLP-WOA-5     104.339       0.816   0.890           246.718       0.609   0.736
MLR-5         108.871       0.797   0.877           254.146       0.593   0.733

PCC, Pearson correlation coefficient; RMSE, root mean square error; WI, Willmott index.


Temporal variation of the observed daily runoff and of the daily runoff predicted by the SVR-WOA-5, SVR-GWO-5, MLP-GWO-5, MLP-WOA-5, and MLR-5 models for the testing period is shown through line diagrams (left side) in Fig. 12.4A-E. The observed and predicted runoff are also plotted as scatter diagrams (right side) on a 1:1 plot in Fig. 12.4A-E. It is observed that the regression line and the 1:1 line are close to each other for every model; however, the two lines are closest for the SVR-WOA-5 model, with a coefficient of determination (R2) value of 0.5367 (Fig. 12.4A). Thus, based on the statistical and visual performance evaluation indicators, the models can be placed in hierarchical order from best to worst as SVR-WOA-5 > SVR-GWO-5 > MLP-GWO-5 > MLP-WOA-5 > MLR-5. The pattern of observed daily runoff and of the daily runoff predicted by the SVR-WOA-5, SVR-GWO-5, MLP-GWO-5, MLP-WOA-5, and MLR-5 models for the testing period is shown in Fig. 12.5 through the Taylor diagram, which emphasizes the accuracy and efficiency of the different models relative to the observed runoff values. The Taylor diagram exhibits the SD (the radial distances from the origin to the points are proportional to the standard deviations), the correlation coefficient (CC: the azimuth positions give the correlation coefficient between the two fields), and the RMSE (the dotted arcs measure the distance from the reference point and indicate the RMSE) in a single polar-style frame. It is clearly seen that the SVR-WOA-5 model provides lower values of RMSE (~114 m3/s) and SD (~170 m3/s) and a higher value of CC (0.730; predicted runoff values closer to the observed runoff values) than the SVR-GWO-5, MLP-GWO-5, MLP-WOA-5, and MLR-5 models. In other words, the test field (i.e., SVR-WOA-5) is close to the reference field (i.e., observed). Hence, the SVR-WOA-5 model with Rt, Rt-1, Qt-1, and Qt-2 inputs can be used for predicting daily runoff in the study area. Furthermore, the prediction accuracy of the SVR-GWO-5, MLP-GWO-5, MLP-WOA-5, and MLR-5 models in terms of RMSE was found to decrease by 2%, 4%, 10%, and 12%, respectively, in comparison with the RMSE values of the SVR-WOA-5 model. The approach of developing hybrid AI models described in this chapter and demonstrated through a case study results in a reliable intelligent system that can be used for prediction of daily runoff values in the study area, which is extremely valuable for water resources planners and managers. The results of this study demonstrated the successful application of hybrid AI models for the efficient prediction of daily runoff and ranked the models in order of performance from best to worst as SVR-WOA-5 > SVR-GWO-5 > MLP-GWO-5 > MLP-WOA-5 > MLR-5 for the study area. These results agree with the findings of a previous study: Tikhamarine et al. (2019b) compared the effectiveness of the wavelet


FIGURE 12.4 Line (left) and scatter (right) plots among observed and predicted runoff values yielded by (A) SVR-WOA-5, (B) SVR-GWO-5, (C) MLP-GWO-5, (D) MLP-WOA-5, and (E) MLR-5 models during the testing period in the study area.


FIGURE 12.5 Taylor diagram of observed and predicted runoff values by the SVR-WOA-5, SVR-GWO-5, MLP-GWO-5, MLP-WOA-5, and MLR-5 models for the testing period in the study area.

support vector regression-grey wolf optimizer (WSVR-GWO) model against SVR-PSO, SVR-shuffled complex evolution, and SVR-MVO models for predicting monthly streamflow from two catchments located in Algeria. They also found the superior performance of the WSVR-GWO model over the other models.

12.5 Conclusions

The purpose of this chapter was to describe hybrid AI models developed by coupling SVR and MLP neural network models with two metaheuristic algorithms, WOA and GWO. The chapter also presents a case study aimed at evaluating the effectiveness of the developed hybrid AI models, i.e., SVR-WOA, SVR-GWO, MLP-GWO, and MLP-WOA, against the MLR model for predicting daily runoff in the Naula watershed located in the upper RRC of Uttarakhand State, India. The optimal variables and the appropriate input combination for the SVR-WOA, SVR-GWO, MLP-GWO, MLP-WOA, and MLR models were selected using the GT before model


development and evaluation. The runoff estimates obtained from these models were compared with the observed runoff values using the performance evaluation criteria of RMSE, PCC, and WI and through visual inspection of several diagrams. The comparative results revealed that the hybrid AI models performed better than the MLR model. Specifically, the performance of the SVR-WOA model was superior to the other hybrid AI models in predicting daily runoff. Furthermore, the results of this study revealed the capability of the GT to determine the significant input variables and, in a less time-consuming way, the optimal combination of input parameters for daily runoff prediction in the study area. In addition, the proposed methodology of the GT combined with the hybrid AI models can be used for forecasting other hydrometeorological variables using historical time series data.

References Adnan, R.M., Yuan, X., Kis¸i, O., Adnan, M., Mehmood, A., 2018. Stream flow forecasting of poorly gauged mountainous watershed by least square support vector machine, fuzzy genetic algorithm and M5 model tree using climatic data from nearby station. Water Resour. Manag. 32, 4469e4486. https://doi.org/10.1007/s11269-018-2033-2. Ahani, A., Shourian, M., Rahimi Rad, P., 2018. Performance assessment of the linear, nonlinear and nonparametric data driven models in river flow forecasting. Water Resour. Manag. 32, 383e399. https://doi.org/10.1007/s11269-017-1792-5. Al-Zoubi, A.M., Faris, H., Alqatawna, J., Hassonah, M.A., 2018. Evolving Support Vector Machines using Whale Optimization Algorithm for spam profiles detection on online social networks in different lingual contexts. Knowl. Base Syst. 153, 91e104. https://doi.org/ 10.1016/j.knosys.2018.04.025. Ali, S., Shahbaz, M., 2020. Streamflow forecasting by modeling the rainfallestreamflow relationship using artificial neural networks. Model. Earth Syst. Environ. 6, 1645e1656. https:// doi.org/10.1007/s40808-020-00780-3. Aljarah, I., Faris, H., Mirjalili, S., 2018. Optimizing connection weights in neural networks using the whale optimization algorithm. Soft Comput. 22, 1e15. https://doi.org/10.1007/s00500016-2442-1. Ashrafzadeh, A., Malik, A., Jothiprakash, V., Ghorbani, M.A., Biazar, S.M., 2020. Estimation of daily pan evaporation using neural networks and meta-heuristic approaches. ISH J. Hydraul. Eng. 26, 421e429. https://doi.org/10.1080/09715010.2018.1498754. Ateeq-ur-Rauf, Ghumman, A.R., Ahmad, S., Hashmi, H.N., 2018. Performance assessment of artificial neural networks and support vector regression models for stream flow predictions. Environ. Monit. Assess. 190, 1e20. https://doi.org/10.1007/s10661-018-7012-9. Azamathulla, H.M., Wu, F.-C., 2011. Support vector machine approach for longitudinal dispersion coefficients in natural streams. Appl. Soft Comput. 11, 2902e2905. https://doi.org/10.1016/ j.asoc.2010.11.026. Chang, C.C., Lin, C.J., 2011. LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Technol. 2, 1e27. https://doi.org/10.1145/1961189.1961199. Chang, J., Zhang, H., Wang, Y., Zhang, L., 2017. Impact of climate change on runoff and uncertainty analysis. Nat. Hazards 88, 1113e1131. https://doi.org/10.1007/s11069-017-2909-0.

326 Advances in Streamflow Forecasting Choubin, B., Malekian, A., 2017. Combined gamma and M-test-based ANN and ARIMA models for groundwater fluctuation forecasting in semiarid regions. Environ. Earth Sci. 76, 1e10. https://doi.org/10.1007/s12665-017-6870-8. Dai, S., Niu, D., Li, Y., 2018. Daily peak load forecasting based on complete ensemble empirical mode decomposition with adaptive noise and support vector machine optimized by modified grey wolf optimization algorithm. Energies 11, 1e25. https://doi.org/10.3390/en11010163. Danandeh Mehr, A., 2018. An improved gene expression programming model for streamflow forecasting in intermittent streams. J. Hydrol. 563, 669e678. https://doi.org/10.1016/ j.jhydrol.2018.06.049. Dehghani, M., Riahi-Madvar, H., Hooshyaripor, F., Mosavi, A., Shamshirband, S., Zavadskas, E., Chau, K., 2019. Prediction of hydropower generation using grey wolf optimization adaptive neuro-fuzzy inference system. Energies 12, 1e20. https://doi.org/10.3390/en12020289. Diop, L., Bodian, A., Djaman, K., Yaseen, Z.M., Deo, R.C., El-shafie, A., Brown, L.C., 2018. The influence of climatic inputs on stream-flow pattern forecasting: case study of Upper Senegal River. Environ. Earth Sci. 77, 1e13. https://doi.org/10.1007/s12665-018-7376-8. Elshorbagy, A., Corzo, G., Srinivasulu, S., Solomatine, D.P., 2010. Experimental investigation of the predictive capabilities of data driven modeling techniques in hydrology - Part 1: concepts and methodology. Hydrol. Earth Syst. Sci. 14, 1931e1941. https://doi.org/10.5194/hess-141943-2010. Faris, H., Aljarah, I., Mirjalili, S., 2016. Training feedforward neural networks using multi-verse optimizer for binary classification problems. Appl. Intell. 45, 322e332. https://doi.org/ 10.1007/s10489-016-0767-1. Ghorbani, M.A., Khatibi, R., Karimi, V., Yaseen, Z.M., Zounemat-Kermani, M., 2018. Learning from multiple models using artificial intelligence to improve model prediction accuracies: application to river flows. Water Resour. Manag. 32, 4201e4215. https://doi.org/10.1007/ s11269-018-2038-x. Granata, F., Gargano, R., de Marinis, G., 2016. Support vector regression for rainfall-runoff modeling in urban drainage: a comparison with the EPA’s storm water management model. Water 8, 1e18. https://doi.org/10.3390/w8030069. Hadi, S.J., Tombul, M., 2018. Forecasting daily streamflow for basins with different physical characteristics through data-driven methods. Water Resour. Manag. 32, 3405e3422. https:// doi.org/10.1007/s11269-018-1998-1. Hussain, D., Khan, A.A., 2020. Machine learning techniques for monthly river flow forecasting of Hunza River, Pakistan. Earth Sci. India 13, 939e949. https://doi.org/10.1007/s12145-02000450-z. Jaiswal, R.K., Ali, S., Bharti, B., 2020. Comparative evaluation of conceptual and physical rainfallerunoff models. Appl. Water Sci. 10, 1e14. https://doi.org/10.1007/s13201-019-11226. Jiang, Z., Li, R., Li, A., Ji, C., 2018. Runoff forecast uncertainty considered load adjustment model of cascade hydropower stations and its application. Energy 158, 693e708. https://doi.org/ 10.1016/j.energy.2018.06.083. Kakaei Lafdani, E., Moghaddam Nia, A., Ahmadi, A., 2013. Daily suspended sediment load prediction using artificial neural networks and support vector machines. J. Hydrol. 478, 50e62. https://doi.org/10.1016/j.jhydrol.2012.11.048. Kennedy, J., Eberhart, R., 1995. Particle swarm optimization. In: Proceedings of ICNN’95 - International Conference on Neural Networks. IEEE, pp. 1942e1948.


Kis¸i, O., 2015. Streamflow forecasting and estimation using least square support vector regression and adaptive neuro-fuzzy embedded fuzzy c-means clustering. Water Resour. Manag. 29, 5109e5127. https://doi.org/10.1007/s11269-015-1107-7. Kis¸i, O., Cimen, M., 2011. A wavelet-support vector machine conjunction model for monthly streamflow forecasting. J. Hydrol. 399, 132e140. https://doi.org/10.1016/ j.jhydrol.2010.12.041. LV, Z., Zuo, J., Rodriguez, D., 2020. Predicting of runoff using an optimized SWAT-ANN: a case study. J. Hydrol. Reg. Stud. 29, 100688. https://doi.org/10.1016/j.ejrh.2020.100688. Mafarja, M., Mirjalili, S., 2018. Whale optimization approaches for wrapper feature selection. Appl. Soft Comput. 62, 441e453. https://doi.org/10.1016/j.asoc.2017.11.006. Mafarja, M.M., Mirjalili, S., 2017. Hybrid Whale Optimization Algorithm with simulated annealing for feature selection. Neurocomputing 260, 302e312. https://doi.org/10.1016/ j.neucom.2017.04.053. Malik, A., Kumar, A., 2015. Pan evaporation simulation based on daily meteorological data using soft computing techniques and multiple linear regression. Water Resour. Manag. 29, 1859e1872. https://doi.org/10.1007/s11269-015-0915-0. Malik, A., Kumar, A., Kim, S., Kashani, M.H., Karimi, V., Sharafati, A., Ghorbani, M.A., Al-Ansari, N., Salih, S.Q., Yaseen, Z.M., Chau, K.-W., 2020. Modeling monthly pan evaporation process over the Indian central Himalayas: application of multiple learning artificial intelligence model. Eng. Appl. Comput. Fluid Mech. 14, 323e338. https://doi.org/10.1080/ 19942060.2020.1715845. Malik, A., Kumar, A., Kis¸i, O., 2017. Monthly pan-evaporation estimation in Indian central Himalayas using different heuristic approaches and climate based models. Comput. Electron. Agric. 143, 302e313. https://doi.org/10.1016/j.compag.2017.11.008. Malik, A., Kumar, A., Kis¸i, O., 2018. Daily Pan evaporation estimation using heuristic methods with gamma test. J. Irrigat. Drain. Eng. 144, 04018023. https://doi.org/10.1061/(ASCE) IR.1943-4774.0001336. Malik, A., Kumar, A., Kis¸i, O., Shiri, J., 2019. Evaluating the performance of four different heuristic approaches with Gamma test for daily suspended sediment concentration modeling. Environ. Sci. Pollut. Res. 26, 22670e22687. https://doi.org/10.1007/s11356-019-05553-9. Malik, A., Kumar, A., Piri, J., 2017. Daily suspended sediment concentration simulation using hydrological data of Pranhita River Basin, India. Comput. Electron. Agric. 138, 20e28. https://doi.org/10.1016/j.compag.2017.04.005. McClelland, J.L., Rumelhart, D.E., 1989. Explorations in Parallel Distributed Processing: A Handbook of Models, Programs, and Exercises. MIT Press, Cambridge. Mirjalili, S., 2015. How effective is the Grey Wolf optimizer in training multi-layer perceptrons. Appl. Intell. 43, 150e161. https://doi.org/10.1007/s10489-014-0645-7. Mirjalili, S., Aljarah, I., Mafarja, M., Heidari, A.A., Faris, H., 2020. Grey wolf optimizer: theory, literature review, and application in computational fluid dynamics problems. In: Studies in Computational Intelligence, pp. 87e105. Mirjalili, S., Lewis, A., 2016. The whale optimization algorithm. Adv. Eng. Software 95, 51e67. https://doi.org/10.1016/j.advengsoft.2016.01.008. Mirjalili, S., Mirjalili, S.M., Hatamlou, A., 2016. Multi-Verse Optimizer: a nature-inspired algorithm for global optimization. Neural Comput. Appl. 27, 495e513. https://doi.org/10.1007/ s00521-015-1870-7. Mirjalili, S., Mirjalili, S.M., Lewis, A., 2014. Grey wolf optimizer. Adv. Eng. 

Chapter 13

Flood forecasting and error simulation using copula entropy method

Lu Chen1, Vijay P. Singh2

1School of Civil and Hydraulic Engineering, Huazhong University of Science and Technology, Wuhan, Hubei, China; 2Department of Biological & Agricultural Engineering and Zachry Department of Civil & Environmental Engineering, Texas A&M University, College Station, TX, United States

13.1 Introduction

Floods are the most common natural hazards and the third most damaging disaster globally after severe storms and earthquakes. Extreme flood events have been and continue to be one of the most important natural hazards responsible for deaths and economic losses. In China, a number of flood events have occurred over the last 100 years, which caused great economic losses (Zhang and Hall, 2004). For example, floods in the Yangtze River (Chang Jiang) basin in central and eastern China have occurred periodically and often have caused considerable destruction of property and loss of life. Among the major flood events are those of 1870, 1931, 1954, 1998, and 2010. In 1998, the Yangtze River basin suffered from tremendous flooding, the largest flood since 1954, which led to an economic loss of 166 billion Chinese Yuan (or nearly 20 billion US $) (Yin and Li, 2001). In the United States, the Mississippi River basin suffered from 36 floods in the 20th century, of which the flood of 1927 was the worst, causing nearly 500 deaths and $1 billion in losses (Ward, 1979). With the use of precipitation data, a rainfall-runoff model is used to forecast streamflow or river flow for periods ranging from a few hours to days ahead, depending on the size of the watershed or river basin. The forecast values are of great importance for the reduction of disasters and the optimal use of water resources (Fig. 13.1).

FIGURE 13.1 The process of the copula entropy method.

In the literature, many types of models, such as conceptual hydrological models and distributed hydrological models, have been used for flood forecasting. A data-driven technique that has gained significant attention for its effectiveness in approximating function characteristics is artificial neural network


(ANN) modeling (de Vos and Rientjes, 2005; Kasiviswanathan and Sudheer, 2013). Many studies on streamflow prediction have shown that the ANN is superior to traditional regression techniques and time series models, including autoregressive and autoregressive moving average models (Raman and Sunilkumar, 1995; Jain et al., 1999; Thirumalaiah and Deo, 2000; Abrahart and See, 2002; Castellano-Méndez et al., 2004). Hsu et al. (1995) showed that the ANN model provided a better representation of the rainfall-runoff relationship in the medium-size Leaf River basin near Collins, Mississippi, than the time series model or the conceptual SAC-SMA (Sacramento soil moisture accounting) model. Comparing the effectiveness of an ANN flood forecasting model with the simple linear model, the seasonally based linear perturbation model, and the nearest neighbor linear perturbation model, Shamseldin (1997) concluded that the ANN model provided more accurate discharge forecasts than the other models did in six catchments. Birikundavyi et al. (2002) investigated the efficacy of ANN models for daily streamflow prediction and also showed that the ANN outperformed the AR model coupled with a Kalman filter. Therefore, ANN models have proved to be a promising tool for flood forecasting. However, in most ANN applications to flood forecasting, little attention has been given to the selection of model inputs (Maier and Dandy, 2000).


In general, all potential input variables are not equally informative, as some of them may be correlated, noisy, or have no significant relationship with the output variable being modeled (Bowden et al., 2005a). Also, using a large number of inputs to the ANN models and relying on the models to determine the critical model inputs usually increases the model size (Maier and Dandy, 2000). This also has a number of other disadvantages, such as decreasing processing speed and increasing the amount of data required to efficiently estimate the connection weights (Lachtermacher and Fuller, 1994). Bowden et al. (2005a) presented a review of methods used for input determination in ANN applications in water resources. Their review revealed that the inputs for ANN modeling are most commonly determined using three kinds of methods: (i) methods that rely on the use of a priori knowledge of the system being modeled, (ii) methods based on dependence structures between variables, and (iii) methods that utilize a heuristic approach. The a priori knowledge method, which depends on experts' knowledge, is subjective and case-dependent. For the dependence measuring method, there are several traditional measures to describe the dependence structure between variables. The most commonly used one is the linear relation, which mainly appears in regression models. This kind of linear relation is usually measured by the covariance and correlation coefficient corresponding to the multivariate normal distribution (Xu, 2005; Zhao and Lin, 2011). The drawbacks of the linear correlation method can be summarized as follows: (i) it only applies to linear correlation; (ii) it tends to focus on the degree of dependence; and (iii) it ignores the structure of dependence (Zhao and Lin, 2011). Thus, two important measures of dependence (concordance), known as Kendall's tau and Spearman's rho, provide the best alternatives to the linear correlation coefficient (LCC) as measures of dependence for non-Gaussian distributions, for which the LCC is inappropriate and often misleading. The disadvantage of rank-based correlation coefficients is that there is a loss of information when the data are converted to ranks; if the data are normally distributed, the rank-based correlation coefficient is less powerful than the Pearson correlation coefficient (Gauthier, 2001). In the case of a heuristic method, various ANN models are trained using different subsets of inputs. The main disadvantage of this method is its dependence on a trial-and-error procedure without any guarantee of finding the globally best subset. Another disadvantage of the stepwise heuristic method is that it is computationally intensive (Bowden et al., 2005a).


et al., 2006). It can be used to indicate the dependence or independence between variables. If the two variables are independent, the MI between them is zero. If the two are strongly dependent, e.g., one is a function of another, the MI between them is large (Li, 1990; Singh, 2013, 2014, 2015, 2016). The use of the MI method has become popular in several fields of science to measure the dependence between variables (Alfonso et al., 2010). For example, using the MI method, Harmancioglu and Yevjevich (1987) analyzed three types of information transfer among river points in the Esencay River, Turkey. The MI has also been used for network design (Krstanovic and Singh, 1992a,b; Alfonso et al., 2010; Singh, 2013, 2014, 2015, 2016). Some of the advantages of this method have been reported widely (e.g., Li, 1990; Singh, 2000, 2013, 2014, 2015, 2016; Steuer et al., 2002). The advantages of MI include the following: (i) it is a nonlinear measure of statistical dependence based on information theory (Steuer, 2006); and (ii) it is a nonparametric method, which makes no assumptions about the functional form (Gaussian or non-Gaussian) of the statistical distribution that produced the data. In this chapter, the entropy-based method was used for the identification of influential inputs. There is also a disadvantage of using the MI method in selecting dominant inputs for ANN modeling. Although a candidate model input might have a strong relationship with the model output, this information might be redundant if the same information is already provided by another input (Fernando et al., 2009). Bowden et al. (2005a) and Fernando et al. (2009) pointed out that input selection algorithms should account for the redundancy in candidate model inputs. In order to fulfill this requirement, Sharma (2000) proposed a partial mutual information (PMI) criterion as the basis for identifying more than one predictor in a stepwise manner. In a review of approaches used to select inputs for ANN models, Bowden et al. (2005a) concluded that the PMI algorithm proposed by Sharma (2000) was superior to methods commonly used to determine inputs for ANN models, as it is model-free and uses a nonlinear measure of dependence (MI). In the PMI algorithm, nonparametric kernel methods were used to characterize the joint probability distribution of the variables involved. Fernando et al. (2009) modified the PMI input selection algorithm in order to increase its computational efficiency while maintaining accuracy. They introduced average shifted histograms as an alternative approach to kernel-based methods for the estimation of MI. There are several disadvantages of the PMI algorithm. First, hydrological events, such as rainfall and runoff, are continuous processes; however, existing PMI methods usually use a discrete version to calculate the PMI. Therefore, a PMI method for continuous variables is needed. Second, these methods need estimates of both marginal and joint probability distributions, and for a d-dimensional multivariate distribution it is difficult to obtain the joint probability distributions. In order to overcome these problems, this chapter proposes a new method, which calculates the PMI values directly based on copula entropy (CE).


Ma and Sun (2008) combined the copula function and entropy theory and introduced the concept of CE, which is defined as the entropy of the copula function and is related to the joint entropy, marginal entropy, and MI. The CE concept has received significant attention in recent years. Li and Zheng (2016) proved the effectiveness of the CE for probabilistic modeling of the flood events of two hydrological gauges. Leandro et al. (2020) defined a CE-based model for the joint simulation of monthly streamflow and wind speed time series to evaluate the potential integration of hydro and wind energy sources. Wang et al. (2018) developed an alternative method for hydrological prediction by combining CE with a wavelet neural network. Calsaverini and Vicente (2009) derived a simple test of Gaussianity from an estimate of the information excess and suggested a method for copula identification based on information content matching, exploiting the connections between copula and entropy methods, which involve CE. Zhao and Lin (2011) applied CE models with two and three variables to measure the LCC and the MI in stock markets. The advantages of CE are summarized as follows: (i) it makes no assumptions about the marginal distributions and can be used for higher dimensions, and (ii) the MI is obtained from the calculation of the CE instead of the marginal or joint entropy, which estimates the MI more directly and avoids the accumulation of systematic bias. The literature review shows that the CE method has not yet been widely used in hydrology. The objective of this chapter is to present the development of a method for input selection for the ANN model based on the CE theory. The theoretical background, including a discussion and presentation of the concepts and formulas of the ANN model, entropy, and copulas, is introduced. The new method, based on the CE theory to determine the inputs of an ANN model, is presented. Furthermore, the applicability and accuracy of the new CE method are tested through two case studies. Finally, the proposed input selection method is compared with the traditional methods.

13.2 Background

13.2.1 Artificial neural networks

In statistical modeling, nonlinear dynamic processes are approximated by an empirical regression model of the general form (May et al., 2008a):

$$y(t) = F\big(y(t-1), \ldots, y(t-p),\; Q(t-1), \ldots, Q(t-q)\big) \tag{13.1}$$

where F is a function, y is the model output predicted at time t, and p and q are the parameters denoting the model order or the number of lags. The model inputs comprise past observations (or lags) of y and Q, where Q can be either rainfall or runoff at other stations. The function F is unknown, and the ANN is used to determine the form of F based on a set of representative data. In this study, three ANN models were used, namely, multilayer feedforward neural


networks, radial basis function networks, and general regression neural network (GRNN) models. A brief overview of these ANN models is provided in other chapters of this book and further details can be found in Chen et al. (2014).
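As a concrete illustration of the empirical regression form in Eq. (13.1), the minimal sketch below builds a lagged input matrix from a flow series and an upstream series and fits a small feedforward network to approximate F. The synthetic data, the lag orders p and q, and the network size are arbitrary assumptions for illustration; they do not reproduce any configuration used in this chapter.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def make_lagged_matrix(y, q_series, p, q):
    """Build inputs [y(t-1)..y(t-p), Q(t-1)..Q(t-q)] and target y(t), as in Eq. (13.1)."""
    k = max(p, q)
    X, target = [], []
    for t in range(k, len(y)):
        row = [y[t - i] for i in range(1, p + 1)]
        row += [q_series[t - j] for j in range(1, q + 1)]
        X.append(row)
        target.append(y[t])
    return np.asarray(X), np.asarray(target)

# Synthetic flow and upstream-station series, for illustration only
rng = np.random.default_rng(0)
q_up = 50 + 10 * np.sin(np.arange(1500) / 30) + rng.normal(0, 2, 1500)
y = np.convolve(q_up, [0.5, 0.3, 0.2], mode="same") + rng.normal(0, 1, 1500)

X, t = make_lagged_matrix(y, q_up, p=3, q=3)
model = MLPRegressor(hidden_layer_sizes=(10,), max_iter=2000, random_state=0)
model.fit(X[:1200], t[:1200])                 # training split
print("R^2 on held-out data:", model.score(X[1200:], t[1200:]))
```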

13.2.2 Entropy theory

The Shannon entropy (Shannon, 1948) quantitatively measures the mean uncertainty associated with a probability distribution of a random variable, and in turn with the random variable itself, in concert with several consistency requirements (Kapur and Kesavan, 1992). The entropy of a random variable (r.v.) X can be expressed as (Chen and Guo, 2019):

$$H(X) = -\int_{0}^{\infty} f(x)\,\log f(x)\, dx \tag{13.2}$$

where f(x) is the probability density function of variable X. In the case study of this chapter, we focus on flood flow, and thus the range of the variable is from 0 to infinity; actually, the domain can be extended to any real number. The entropy expressed by the above equation defines the univariate continuous entropy or marginal entropy of X. The units of entropy are given by the base of the logarithm, being "nats" for base e, "docit" for base 10, and "bits" for base 2. The natural logarithm is used hereafter in this chapter.

For two random variables (r.v.s) $X_1$ and $X_2$, the joint entropy can be expressed as (Chen and Guo, 2019):

$$
\begin{aligned}
H(X_1, X_2) &= -\int_{0}^{\infty}\!\!\int_{0}^{\infty} f(x_1, x_2)\,\log f(x_1, x_2)\, dx_1\, dx_2 \\
&= -\int_{0}^{\infty}\!\!\int_{0}^{\infty} f(x_1, x_2)\,\log\big[f(x_2 \mid x_1)\, f(x_1)\big]\, dx_1\, dx_2 \\
&= -\int_{0}^{\infty}\!\!\int_{0}^{\infty} f(x_1, x_2)\,\log f(x_2 \mid x_1)\, dx_1\, dx_2 - \int_{0}^{\infty}\!\!\int_{0}^{\infty} f(x_1, x_2)\,\log f(x_1)\, dx_1\, dx_2 \\
&= H(X_2 \mid X_1) - \int_{0}^{\infty} \log f(x_1) \left[\int_{0}^{\infty} f(x_1, x_2)\, dx_2\right] dx_1 \\
&= H(X_2 \mid X_1) - \int_{0}^{\infty} f(x_1)\,\log f(x_1)\, dx_1 \\
&= H(X_2 \mid X_1) + H(X_1)
\end{aligned}
\tag{13.3}
$$


Let $X_1, X_2, \ldots, X_d$ denote the r.v.s. The multidimensional joint entropy can be expressed as (Chen and Guo, 2019):

$$H(X_1, X_2, \ldots, X_d) = -\int_{0}^{\infty}\!\cdots\!\int_{0}^{\infty} f(x_1, x_2, \ldots, x_d)\,\log\big[f(x_1, x_2, \ldots, x_d)\big]\, dx_1\, dx_2 \cdots dx_d \tag{13.4}$$
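For a data series, the integral in Eq. (13.2) has to be approximated numerically. A minimal sketch of a histogram-based estimate of the marginal entropy (in nats, consistent with the natural logarithm adopted above) is given below; the bin count and the skewed test sample are arbitrary assumptions, and more refined estimators (kernel or nearest-neighbor based) are normally preferred in practice.

```python
import numpy as np

def marginal_entropy(x, bins=30):
    """Histogram approximation of H(X) = -integral of f(x) ln f(x) dx (Eq. 13.2), in nats."""
    counts, edges = np.histogram(x, bins=bins, density=True)   # counts = density estimate f_i
    widths = np.diff(edges)
    p = counts * widths                                        # probability mass of each bin
    nonzero = p > 0
    # H ~= -sum_i p_i * ln(f_i)
    return -np.sum(p[nonzero] * np.log(counts[nonzero]))

rng = np.random.default_rng(1)
sample = rng.gamma(shape=2.0, scale=3.0, size=5000)            # a skewed, flow-like sample
print("Estimated marginal entropy (nats):", marginal_entropy(sample))
```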

13.2.3 Copula function

In the application of univariate, bivariate, and multivariate entropy formulas, it is necessary to use univariate, bivariate, and multivariate distributions. The problem of specifying a probability model for dependent multivariate observations is simplified by expressing the corresponding d-dimensional joint cumulative distribution using a copula function (Salvadori and De Michele, 2010). Following Sklar (1959) and Nelsen (2006), if $F(x_1, x_2, \ldots, x_d)$ is a multivariate distribution function of d correlated random variables $X_1, X_2, \ldots, X_d$ with respective marginal distributions (or margins) $F_1(x_1), F_2(x_2), \ldots, F_d(x_d)$, then it is possible to write the d-dimensional cumulative distribution function (CDF) with univariate margins $F_1(x_1), F_2(x_2), \ldots, F_d(x_d)$ as follows (Salvadori and De Michele, 2010):

$$F(x_1, x_2, \ldots, x_d) = C\big(F_1(x_1), F_2(x_2), \ldots, F_d(x_d)\big) = C(u_1, \ldots, u_d) \tag{13.5}$$

where $F_k(x_k) = U_k$ for $k = 1, \ldots, d$, with $U_k \sim U(0, 1)$, and C is a function called the copula. The copula function is capable of exhibiting the structure of dependence between two or more random variables. It has recently emerged as a practical and efficient method for modeling general dependence in multivariate data (e.g., Joe, 1997; Nelsen, 2006). The advantages of using copulas to model joint distributions are manifold: (i) flexibility in choosing arbitrary marginals and structure of dependence, (ii) extension to more than two variables, and (iii) separate analysis of marginal distributions and dependence structure (Salvadori et al., 2007; Serinaldi et al., 2009). Hydrological applications of copulas have surged in recent years (e.g., Bhuyan-Erhardt et al., 2019). For example, copulas have been used for rainfall frequency analysis (Liu et al., 2015; Wei and Song, 2018; Mesbahzadeh et al., 2019; Li et al., 2019), flood frequency analysis (Fu et al., 2014; Ozga-Zielinski et al., 2016; Durocher et al., 2018; Filipova et al., 2018), drought frequency analysis (Bazrafshan et al., 2020; Vergni et al., 2020; Saghafian and Sanginabadi, 2020), analysis of rainfall and flood events (Jhong and Tung, 2018; Bevacqua et al., 2017; Liu et al., 2020), sea storm analysis (Hou et al., 2019), and other theoretical analyses of multivariate extreme problems (Hao et al., 2017;


Liu et al., 2018; Qian et al., 2018). Detailed theoretical background and descriptions of the use of copulas can be found in Nelsen (2006), Salvadori et al. (2007), and Zhang and Singh (2019). In the literature, many copula families have been proposed and described (Nelsen, 2006; Salvadori et al., 2007; Zhang and Singh, 2019), such as the Archimedean and elliptical copulas. Of all the copula families, the Archimedean family is the most widely used in hydrological studies, as it is more easily constructed and applied regardless of whether the correlation among the hydrological variables is positive or negative (Zhang and Singh, 2006, 2019). The bivariate Archimedean copula has the simple algebraic form (Salvadori et al., 2007):

$$C(u_1, u_2) = \varphi^{-1}\big[\varphi(u_1) + \varphi(u_2)\big], \qquad u_1, u_2 \in \mathbf{I} \tag{13.6}$$

where $\varphi$ is a specific function known as the generator of C. A large variety of copulas belong to the Archimedean family. Three one-parameter Archimedean copulas, including the Gumbel, Frank, and Clayton copulas, have been widely applied in frequency analysis (Favre et al., 2004; Zhang and Singh, 2006, 2019). Therefore, these copulas are used in the case study of this chapter; their forms are listed in Table 13.1.

13.3 Determination of ANN model inputs based on copula entropy

In this section, the theory of the analytical technique, namely the CE method, used for input selection for the ANN model is described, and the methodology for its application is explained. The description is organized into three subsections: first, the CE theory is introduced; second, the theory of PMI is provided; and third, the procedure of input selection for the ANN model using the CE method is introduced, following the CE and PMI theories.

TABLE 13.1 Details of three types of Archimedean copulas.

Gumbel: $C(u_1, \ldots, u_d) = \exp\left\{-\left[\sum_{i=1}^{d}(-\ln u_i)^{\theta}\right]^{1/\theta}\right\}$; $i = 1, 2, \ldots, d$; $\theta \in [1, \infty)$

Clayton: $C(u_1, \ldots, u_d) = \left(\sum_{i=1}^{d} u_i^{-\theta} - d + 1\right)^{-1/\theta}$; $i = 1, 2, \ldots, d$; $\theta \in (0, \infty)$

Frank: $C(u_1, \ldots, u_d) = -\dfrac{1}{\theta}\log\left[1 - \dfrac{\prod_{i=1}^{d}\left(1 - e^{-\theta u_i}\right)}{\left(1 - e^{-\theta}\right)^{d-1}}\right]$; $i = 1, 2, \ldots, d$; $\theta \in \mathbb{R}$
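As an illustration of how one of the Archimedean families in Table 13.1 can be fitted in practice, the sketch below estimates the bivariate Gumbel copula parameter from Kendall's tau using the standard relation θ = 1/(1 − τ). This moment-type estimator and the placeholder data are assumptions for illustration only; they are not the estimation procedure prescribed in this chapter.

```python
import numpy as np
from scipy.stats import kendalltau

def fit_gumbel_theta(x1, x2):
    """Estimate the Gumbel copula parameter from Kendall's tau: theta = 1 / (1 - tau)."""
    tau, _ = kendalltau(x1, x2)
    return 1.0 / (1.0 - tau)

def gumbel_copula_cdf(u1, u2, theta):
    """Bivariate Gumbel copula C(u1, u2) from Table 13.1."""
    s = (-np.log(u1)) ** theta + (-np.log(u2)) ** theta
    return np.exp(-(s ** (1.0 / theta)))

# Placeholder positively dependent sample (e.g., flows at two upstream stations)
rng = np.random.default_rng(6)
base = rng.gamma(2.0, 5.0, size=3000)
x1 = base + rng.normal(0, 1.0, 3000)
x2 = base + rng.normal(0, 1.5, 3000)

theta = fit_gumbel_theta(x1, x2)
print("estimated theta:", round(theta, 2))
print("C(0.8, 0.9) =", round(gumbel_copula_cdf(0.8, 0.9, theta), 3))
```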


13.3.1 Methodology

13.3.1.1 Copula entropy theory

Recent studies have shown that copula modeling provides a simple, yet powerful, framework for modeling interdependence among hydrological data (Nazemi and Amin, 2012). The copula function has been widely used in hydrology for various tasks such as flood frequency analysis (Favre et al., 2004; Chen et al., 2010, 2012) and drought analysis (Song and Singh, 2010a,b; Chen et al., 2013). However, most of these studies have focused on employing the copula function to establish the joint distribution between two variables. In this chapter, the entropy of the copula function, named CE, is proposed to measure the dependence between variables. The CE is defined as follows. Let $X_1, X_2$ be two random variables with marginal distributions $F(x_1), F(x_2)$ and $U_1 = F(x_1), U_2 = F(x_2)$, respectively. Then $U_1$ and $U_2$ are uniformly distributed random variables, and $u_1$ and $u_2$ denote specific values of $U_1$ and $U_2$, respectively. We define the entropy of the copula function as the CE, which is expressed as (Chen and Guo, 2019):

$$H_C(U_1, U_2) = -\int_{0}^{1}\!\!\int_{0}^{1} c(u_1, u_2)\,\log\big[c(u_1, u_2)\big]\, du_1\, du_2 \tag{13.7}$$

where $c(u_1, u_2)$ is the probability density function of the copula, expressed as $c(u_1, u_2) = \dfrac{\partial^2 C(u_1, u_2)}{\partial u_1\, \partial u_2}$.

The relationship between CE and MI is discussed as follows. The joint probability density function of variables $X_1$ and $X_2$ is defined as (Grimaldi and Serinaldi, 2006):

$$f(x_1, x_2) = c(u_1, u_2)\, f(x_1)\, f(x_2) \tag{13.8}$$

Using the expressions of the CE and the joint probability density function, the joint entropy $H(X_1, X_2)$ is expressed as (Chen and Guo, 2019):

$$
\begin{aligned}
H(X_1, X_2) &= -\int_{0}^{\infty}\!\!\int_{0}^{\infty} f(x_1, x_2)\,\log\big[f(x_1, x_2)\big]\, dx_1\, dx_2 \\
&= -\int_{0}^{\infty}\!\!\int_{0}^{\infty} c(u_1, u_2) f(x_1) f(x_2)\,\log\big[c(u_1, u_2) f(x_1) f(x_2)\big]\, dx_1\, dx_2 \\
&= -\int_{0}^{\infty}\!\!\int_{0}^{\infty} c(u_1, u_2) f(x_1) f(x_2)\,\big\{\log\big[c(u_1, u_2)\big] + \log\big[f(x_1)\big] + \log\big[f(x_2)\big]\big\}\, dx_1\, dx_2 \\
&= -\int_{0}^{\infty}\!\!\int_{0}^{\infty} c(u_1, u_2) f(x_1) f(x_2)\,\big\{\log\big[f(x_1)\big] + \log\big[f(x_2)\big]\big\}\, dx_1\, dx_2 \\
&\quad - \int_{0}^{\infty}\!\!\int_{0}^{\infty} c(u_1, u_2) f(x_1) f(x_2)\,\log\big[c(u_1, u_2)\big]\, dx_1\, dx_2 \\
&= A + B
\end{aligned}
\tag{13.9}
$$

$$
\begin{aligned}
A &= -\int_{0}^{\infty}\!\!\int_{0}^{\infty} c(u_1, u_2) f(x_1) f(x_2)\,\big\{\log\big[f(x_1)\big] + \log\big[f(x_2)\big]\big\}\, dx_1\, dx_2 \\
&= -\int_{0}^{\infty}\!\!\int_{0}^{\infty} f(x_1, x_2)\,\big\{\log\big[f(x_1)\big] + \log\big[f(x_2)\big]\big\}\, dx_1\, dx_2 \\
&= -\int_{0}^{\infty}\!\!\int_{0}^{\infty} f(x_1, x_2)\,\log\big[f(x_1)\big]\, dx_1\, dx_2 - \int_{0}^{\infty}\!\!\int_{0}^{\infty} f(x_1, x_2)\,\log\big[f(x_2)\big]\, dx_1\, dx_2 \\
&= -\int_{0}^{\infty} \log\big[f(x_1)\big] \left[\int_{0}^{\infty} f(x_1, x_2)\, dx_2\right] dx_1 - \int_{0}^{\infty} \log\big[f(x_2)\big] \left[\int_{0}^{\infty} f(x_1, x_2)\, dx_1\right] dx_2 \\
&= -\int_{0}^{\infty} f(x_1)\,\log\big[f(x_1)\big]\, dx_1 - \int_{0}^{\infty} f(x_2)\,\log\big[f(x_2)\big]\, dx_2 = H(X_1) + H(X_2)
\end{aligned}
\tag{13.10}
$$

Using $du_1 = f(x_1)\, dx_1$ and $du_2 = f(x_2)\, dx_2$,

$$
\begin{aligned}
B &= -\int_{0}^{\infty}\!\!\int_{0}^{\infty} c(u_1, u_2) f(x_1) f(x_2)\,\log\big[c(u_1, u_2)\big]\, dx_1\, dx_2 \\
&= -\int_{0}^{1}\!\!\int_{0}^{1} c(u_1, u_2)\,\log\big[c(u_1, u_2)\big]\, du_1\, du_2 = H_C(U_1, U_2)
\end{aligned}
\tag{13.11}
$$

Therefore, the joint entropy is expressed as the sum of the two univariate marginal entropies and the CE as follows (Chen and Guo, 2019):

$$H(X_1, X_2) = H(X_1) + H(X_2) + H_C(U_1, U_2) \tag{13.12}$$

Thus, it is seen that the joint entropy $H(X_1, X_2)$ is expressed as the sum of the marginal entropies $H(X_1), H(X_2)$ and the CE $H_C(U_1, U_2)$.

Among the measures of independence between random variables, MI is singled out by its information-theoretical background (Cover, 1991). In contrast to the LCC, MI is sensitive also to dependences that do not manifest themselves in the covariance. In the following theoretical description, we shall assume that the density is a proper smooth function, although more singular densities could also be allowed. The marginal densities of $X_1$ and $X_2$ are $f_{x_1}(x_1) = \int f(x_1, x_2)\, dx_2$ and $f_{x_2}(x_2) = \int f(x_1, x_2)\, dx_1$. The MI is defined as (Kraskov et al., 2004):

$$T(X_1, X_2) = \int\!\!\int f(x_1, x_2)\,\log\!\left[\frac{f(x_1, x_2)}{f_{x_1}(x_1)\, f_{x_2}(x_2)}\right] dx_1\, dx_2 \tag{13.13}$$

The base of the logarithm determines the units in which information is measured. MI can be obtained by estimating $H(X_1)$, $H(X_2)$, and $H(X_1, X_2)$ separately (Cover, 1991):

$$T(X_1, X_2) = H(X_1) + H(X_2) - H(X_1, X_2) \tag{13.14}$$

Therefore, from the equations of CE and MI, one can write:

$$T(X_1, X_2) = H(X_1) + H(X_2) - H(X_1, X_2) = -H_C(X_1, X_2) \tag{13.15}$$

The above expression of MI indicates that the MI is equal to the negative CE. As the MI is always positive, the value of CE is always negative, and the reasons are as follows (Chen and Guo, 2019). According to Eqs. (13.2) and (13.3), the joint entropy $H(X_1, X_2)$ is expressed as

$$
\begin{aligned}
H(X_1, X_2) &= -\int_{0}^{\infty}\!\!\int_{0}^{\infty} f(x_1, x_2)\,\log f(x_1, x_2)\, dx_1\, dx_2 \\
&= -\int_{0}^{\infty}\!\!\int_{0}^{\infty} f(x_1, x_2)\,\log\big[f(x_2 \mid x_1)\, f(x_1)\big]\, dx_1\, dx_2 \\
&= -\int_{0}^{\infty}\!\!\int_{0}^{\infty} f(x_1, x_2)\,\log f(x_2 \mid x_1)\, dx_1\, dx_2 - \int_{0}^{\infty}\!\!\int_{0}^{\infty} f(x_1, x_2)\,\log f(x_1)\, dx_1\, dx_2 \\
&= H(X_2 \mid X_1) - \int_{0}^{\infty} \log f(x_1) \left[\int_{0}^{\infty} f(x_1, x_2)\, dx_2\right] dx_1 \\
&= H(X_2 \mid X_1) - \int_{0}^{\infty} f(x_1)\,\log f(x_1)\, dx_1 = H(X_2 \mid X_1) + H(X_1)
\end{aligned}
\tag{13.16}
$$

Then, Eq. (13.15) is written as

$$T(X_1, X_2) = -H_C(X_1, X_2) = H(X_1) + H(X_2) - H(X_1, X_2) = H(X_2) - H(X_2 \mid X_1) \geq 0 \tag{13.17}$$
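The identity in Eq. (13.15) can be checked numerically. For a bivariate Gaussian copula with correlation ρ, the mutual information has the closed form −½ ln(1 − ρ²), so a Monte Carlo estimate of the copula entropy should be close to its negative. The sketch below is an illustration under that Gaussian assumption only; it is not part of the case study of this chapter.

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

rho = 0.7
rng = np.random.default_rng(2)

# Sample from the Gaussian copula: draw bivariate normals, map margins to uniforms
cov = [[1.0, rho], [rho, 1.0]]
z = rng.multivariate_normal([0.0, 0.0], cov, size=200_000)

# Gaussian copula density c(u1, u2) = phi2(z1, z2; rho) / (phi(z1) * phi(z2))
log_c = (multivariate_normal([0.0, 0.0], cov).logpdf(z)
         - norm.logpdf(z[:, 0]) - norm.logpdf(z[:, 1]))

ce_estimate = -np.mean(log_c)              # Monte Carlo estimate of H_C(U1, U2), Eq. (13.7)
mi_exact = -0.5 * np.log(1 - rho ** 2)     # closed-form MI for the Gaussian case
print("copula entropy (Monte Carlo):", ce_estimate)   # should be close to -MI (negative)
print("-MI (exact):                ", -mi_exact)
```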

13.3.1.2 Partial mutual information

The MI method is used to identify the nonlinear dependence between candidate input and output variables (Fernando et al., 2009). However, this method is not directly able to deal with the issue of redundant inputs (Bowden et al., 2005a). Thus, to overcome this problem, Sharma (2000) introduced the concept of PMI, which is a more general approach as it also captures nonlinear dependencies and needs no explicit modeling. It represents the information between two observations that is not contained in a third one and provides a measure of the partial or additional dependence that a new input can add to the existing prediction model (Bowden et al., 2005a). The PMI between the output (dependent variable) y and the input (independent variable) x, for a set of pre-existing inputs z, is given by (Bowden et al., 2005a):

$$\mathrm{PMI} = \int\!\!\int f_{X',Y'}(x', y')\,\ln\!\left[\frac{f_{X',Y'}(x', y')}{f_{X'}(x')\, f_{Y'}(y')}\right] dx'\, dy' \tag{13.18}$$

where $x' = x - E[x \mid z]$ and $y' = y - E[y \mid z]$, with E denoting the expectation operation. The variables $x'$ and $y'$ only contain the residual information in the variables x and y after considering the effect of the already selected input z (Fernando et al., 2009).


The calculation of PMI involves two steps: (i) calculation of the variables $x'$ and $y'$, and (ii) calculation of the PMI. In the first step, one kind of ANN model, namely, the GRNN model, is used to estimate the variables $x'$ and $y'$, as the GRNN model provides an estimate of $E[y \mid x]$, the conditional expectation of y given x. Then, in the second step, the CE method is used to calculate the PMI value. Following the relationship between CE and MI (or PMI), the PMI is obtained based on the CE method; that is, the calculation of the PMI is converted into the calculation of the CE. The relation between PMI and CE is expressed as (Chen and Guo, 2019):

$$\mathrm{PMI} = \int\!\!\int f_{X',Y'}(x', y')\,\ln\!\left[\frac{f_{X',Y'}(x', y')}{f_{X'}(x')\, f_{Y'}(y')}\right] dx'\, dy' = -H_c(x', y') \tag{13.19}$$

Therefore, the calculation of the PMI is converted into the calculation of the negative CE, and the input variables are determined based on the CE method.
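A minimal sketch of this two-step PMI calculation is given below. A k-nearest-neighbor regressor stands in for the GRNN used to estimate the conditional expectations, and a simple discretized mutual-information estimate stands in for the copula-entropy calculation of the second step; both substitutions, and the placeholder data, are assumptions made only to keep the example short and self-contained.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

def residual(v, z, n_neighbors=10):
    """Residual v - E[v|z]; a kNN regressor stands in for the GRNN conditional-mean estimate."""
    model = KNeighborsRegressor(n_neighbors=n_neighbors).fit(z, v)
    return v - model.predict(z)

def mutual_information(a, b, bins=20):
    """Discretized MI estimate between two 1-D samples (nats)."""
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    pxy = joint / joint.sum()
    px, py = pxy.sum(axis=1, keepdims=True), pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

# Placeholder data: y depends on candidate x beyond what the already-selected input z explains
rng = np.random.default_rng(5)
z = rng.normal(size=(2000, 1))
x = 0.7 * z[:, 0] + rng.normal(scale=0.5, size=2000)
y = 0.5 * z[:, 0] + 0.8 * x + rng.normal(scale=0.3, size=2000)

x_res = residual(x, z)
y_res = residual(y, z)
print("PMI-style score for candidate x:", mutual_information(x_res, y_res))
```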

13.3.1.3 Input selection based on copula entropy method

First, the copula function between the potential inputs and the output is derived. The parameters of the copula function are estimated, and then the copula probability density function is determined.

Second, the values of CE are derived using the multiple integration method, and the CE values are calculated using Eq. (13.7). Once the integrand function is known from the first step, the multiple integration method proposed by Berntsen et al. (1991) is applied to carry out the multiple integrations. In order to test this multiple integration method, the copula probability density function is used as the integrand; the result of the integration should be 1.

Third, the CE algorithm requires a reliable and efficient criterion to decide when to stop the addition of new inputs to the list of selected inputs (Fernando et al., 2009). The Hampel identifier is an outlier detection method for determining whether a given value x is significantly different from the others within a set of values X (Davies and Gather, 1993). It is assumed that the set of candidates initially contains some proportion of redundant variables, and only the significant variables are then detected from the whole set. The absolute deviation from the median PMI is first computed for all candidates (May et al., 2008b; Fernando et al., 2009; Chen and Guo, 2019):

$$d_j = \left| \mathrm{CE}_j - \mathrm{CE}^{(50)} \right| \tag{13.20}$$

where $d_j$ is the absolute deviation of the jth candidate, $\mathrm{CE}_j$ represents the copula entropy value of the jth candidate, and $\mathrm{CE}^{(50)}$ denotes the median copula entropy value for the candidate set. Then, the Hampel distance is calculated by the following expression (Fernando et al., 2009; May et al., 2008b):

$$Z_j = \frac{d_j}{1.4826\, d_j^{(50)}} \tag{13.21}$$

where $Z_j$ denotes the Hampel distance and $d_j^{(50)}$ denotes the median absolute deviation of the $d_j$. If the Hampel distance is greater than 3, namely $Z_j > 3$, then the candidate input is considered influential and is added to the selected input set.
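A minimal numerical sketch of the Hampel criterion in Eqs. (13.20) and (13.21) is given below; it assumes that a copula-entropy value has already been computed for every candidate input (the scores shown are hypothetical), and it simply flags the candidates whose Hampel distance exceeds 3.

```python
import numpy as np

def hampel_select(ce_values, threshold=3.0):
    """Select candidates whose CE score is an outlier (Eqs. 13.20-13.21)."""
    ce = np.asarray(ce_values, dtype=float)
    d = np.abs(ce - np.median(ce))        # absolute deviation from the median CE (Eq. 13.20)
    mad = np.median(d)                    # median absolute deviation d^(50)
    z = d / (1.4826 * mad)                # Hampel distance Z_j (Eq. 13.21)
    return np.where(z > threshold)[0], z

# Hypothetical CE scores for eight candidate inputs (more negative = stronger dependence)
ce_scores = [-0.02, -0.05, -0.64, -0.03, -0.71, -0.04, -0.06, -0.05]
selected, z = hampel_select(ce_scores)
print("selected candidate indices:", selected)   # candidates 2 and 4 in this toy example
print("Hampel distances:", np.round(z, 2))
```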

13.3.2 Application of copula entropy theory in flood forecasting: a case study

In the previous sections, the theoretical background of the CE method has been discussed and explained. In this section, a case study is presented in which the application of the new analytical CE technique is illustrated for selecting influential input data prior to the adopted ANN modeling for flood forecasting and for simulating the uncertainties of the model, for two locations in the Yangtze River basin, China.

13.3.2.1 Study area and data description

The case study was performed in the Yangtze River basin, China; the Yangtze is the longest river in Asia and the third longest river in the world. First, the inflow at the Three Gorges Reservoir (TGR), located in the upper Yangtze River, is predicted based on the previous flow at upstream stations. Second, the streamflow at the outlet of the Jinsha River is predicted for testing the new analytical method of input selection for the ANN modeling.

13.3.2.2 Flood forecasts at Three Gorges Reservoir

The TGR was designed based on the streamflow data at the Yichang gauging station. The streamflow data used at the Yichang gauging station were naturalized, and the storage effects of the TGR were removed. These streamflow data were used to represent the input flow of the TGR, which cannot be measured directly. Therefore, the flood forecasting model presented here aims to predict the input flow of the TGR. The previous streamflow data at six gauging stations located on the main tributaries in the upstream portion, namely the Jinsha, Min, Tuo, Jialing, and Wu rivers, were considered as the potential input candidates, and the streamflow at the Yichang station at time t was considered as the output. The concurrent mean daily streamflow data from 1998 to 2007 at these stations were used in this study; 8 years of daily flow (1998–2005) were used for training the ANN model, and 2 years of daily flow (2006–07) were used for model validation. The new analytical CE method was used to establish the appropriate ANN model and identify the significant input variables.


13.3.2.3 Flood forecasting at the outlet of Jinsha River

The Jinsha River is the westernmost of the major headwater streams of the Yangtze River. It flows through the Qinghai, Sichuan, and Yunnan provinces in western China. The Jinsha River basin is divided into nine subbasins controlled by the Pingshan gauging station. The rainfall data from these subbasins were taken as input, and the runoff/streamflow at the outlet of the Jinsha River was considered as output. The areal average rainfall of each subbasin was calculated. The new analytical method was used to establish the appropriate ANN model and identify the significant inputs. For the Jinsha River, of the total 7 years (2004–10) of daily data, 5 years (2004–08) of daily data were used for training the ANN model, and 2 years (2009–10) of data were used for model validation.

13.3.2.4 Performance evaluation

Three performance evaluation indices were used to evaluate the performance of the newly established ANN models, and the one with the best performance was finally selected for flood forecasting. The performance of hydrological forecasting models in China is generally assessed in accordance with the criteria specified by the Ministry of Water Resources of China (MWR, 2006), which include the coefficient of efficiency (i.e., the Nash–Sutcliffe efficiency [NSE]), a measure of the goodness of fit between recorded and predicted streamflow time series, and the "qualified rate" (a) of predicted individual flood event peak discharges and volumes (Li et al., 2010). A forecasted peak discharge or flood volume is termed "qualified" when the difference between the predicted and the observed value is within 20% of the observed value. The root mean square error (RMSE) between observed and predicted flood values was also used as a performance criterion. The formulas of these criteria are given as follows. The NSE is defined as (Li et al., 2010):

$$\mathrm{NSE} = \left[1 - \frac{\sum_{t=1}^{N}\left(Q_t - \hat{Q}_t\right)^2}{\sum_{t=1}^{N}\left(Q_t - \bar{Q}\right)^2}\right] \times 100\% \tag{13.22}$$

where N represents the total number of observations; $Q_t$ and $\hat{Q}_t$ represent the observed and predicted streamflow values at time t, respectively; and $\bar{Q}$ is the mean value of $Q_t$. The RMSE criterion is defined as (Chen and Guo, 2019):

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{t=1}^{N}\left(Q_t - \hat{Q}_t\right)^2}{N}} \tag{13.23}$$
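The three evaluation criteria can be written compactly as in the sketch below. This is a generic illustration of Eqs. (13.22) and (13.23) and of the 20% "qualified" rule; the sample arrays are placeholders, not values from the case study.

```python
import numpy as np

def nse(obs, pred):
    """Nash-Sutcliffe efficiency in percent (Eq. 13.22)."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return (1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2)) * 100.0

def rmse(obs, pred):
    """Root mean square error (Eq. 13.23)."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return np.sqrt(np.mean((obs - pred) ** 2))

def qualified_rate(obs_peaks, pred_peaks, tol=0.20):
    """Share of forecasts within +/-20% of the observed value."""
    obs_peaks, pred_peaks = np.asarray(obs_peaks, float), np.asarray(pred_peaks, float)
    return np.mean(np.abs(pred_peaks - obs_peaks) / obs_peaks <= tol)

obs = [12000, 15500, 30200, 21000]     # placeholder observed peak discharges (m3/s)
pred = [11500, 17000, 28100, 25900]    # placeholder forecasts
print(nse(obs, pred), rmse(obs, pred), qualified_rate(obs, pred))
```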


13.3.2.5 Results of selected model inputs

Bowden et al. (2005b) proposed a two-step procedure for input selection using PMI, and the same procedure was adopted in this study. The first step is called the bivariate stage, which aims to determine the significant lags of each variable. The second step is called the multivariate stage, in which the significant lags selected in the previous step are combined to form a subset of candidates. Then, the final set of significant inputs can be obtained using the same method as in the first step. Details of the method are described as follows. If the number of candidate variables is d (i.e., $x_1, x_2, \ldots, x_i, \ldots, x_d$) and the output variable is $y_t$, then their past values $x_{i,t-1}, x_{i,t-2}, \ldots, x_{i,t-k}$ and $(y_{t-1}, y_{t-2}, \ldots, y_{t-i})$ are the potential inputs, where k refers to the maximum lag that has been included as a potential input. For the TGR, except for time t, the first 13 lags of each variable were used as the candidate inputs. For the Jinsha River, k was equal to 15.

First, the PMI values for the TGR and the CE values for the Jinsha River between each of $x_{i,t-1}, x_{i,t-2}, \ldots, x_{i,t-k}$ and $y_t$, and between each of $(y_{t-1}, y_{t-2}, \ldots, y_{t-i})$ and $y_t$, were calculated. The significant lags that had the highest MI values and the maximum negative CE values were selected. Then, the PMI value, the CE value, and its standardized Z-value were calculated in each iteration, and the maximum with a Z-value greater than 3 was selected. The final selected results for both stations in the first step are summarized in Table 13.2. During this step, for the TGR, the original 78 inputs were reduced to 12 inputs; for the Jinsha River, the original 180 inputs were reduced to 17 inputs. Thus, the method is regarded as performing adequately. Second, the significant lags selected in the first step were combined to form a subset of the input candidates. The PMI values for the TGR and the Jinsha River were calculated based on the CE method. During this step, for the TGR, 12 inputs were reduced to 7, and for the Jinsha River, 17 inputs were reduced to 12. The final selected variables for the ANN model are given in Table 13.3. The selected significantly influential variables based on the CE method were used as input to the ANN model. For the TGR, of the total 10 years (1998–2007) of data, 8 years (1998–2005) of data were used for training the ANN model, and 2 years (2006–07) of data were used for model validation. For the Jinsha River, of the total 7 years (2004–10) of data, 5 years (2004–08) of data were used for training the ANN model, and 2 years (2009–10) of data were used for model validation. The streamflow hydrographs of the observed and predicted daily streamflows are shown in Fig. 13.2. It can be seen that the proposed ANN model performed quite well, as the predicted streamflow values closely followed the observed values. Bowden et al. (2005a) pointed out that linear correlation analysis is the most popular analytical technique for selecting appropriate inputs, and hence, Pearson LCCs were also used for input selection in this case study. The two input datasets, one selected by the CE method and the other selected by the LCC


TABLE 13.2 Final selection of influential input variables in the first step of the two case studies.

Basin: Three Gorges Reservoir
Station | Selected inputs
Pingshan | t-1, t-2, t-4
Gaochang | t-3
Beibei | t-2
Lijiawan | t-3
Wulong | t-2
Yichang | t-1, t-2, t-3, t-4, t-5

Basin: Jinsha River
Station | Selected inputs
Pingshan | t-1, t-2, t-3, t-4, t-5
Tongzilin | t-2
Shigu | t-3
Subbasin 1 | t-5
Subbasin 2 | t-4, t-6
Subbasin 3 | t-4
Subbasin 4 | t-4
Subbasin 5 | t-5
Subbasin 6 | t-4
Subbasin 7 | t-5
Subbasin 8 | t-6

method, were employed to forecast floods at the TGR and Jinsha River stations. Values of the performance evaluation criteria were calculated, and the results are given in Table 13.4. It can be seen that the RMSE values for the inputs selected by the CE method are smaller than those for the inputs selected by the LCC method. Also, the NSE and qualified rate for the inputs selected by the CE method are higher than those for the inputs selected by the LCC method. Therefore, the flood forecasting model with the input dataset selected by the CE method is better than that with the input dataset selected by the LCC method.

13.4 Flood forecast uncertainties

Uncertainty involved in the forecasted streamflow values is an inherent and important characteristic of the forecasting process that provides a measure of the accuracy of the employed prediction model. The uncertainty of streamflow


TABLE 13.3 Final selected input variables for the artificial neural network model based on the copula entropy method.

Basin: Three Gorges Reservoir
Station | Selected inputs
Pingshan | t-1
Gaochang | t-3
Beibei | t-2
Lijiawan | t-3
Wulong | t-2
Yichang | t-1, t-2

Basin: Jinsha River
Station | Selected inputs
Pingshan | t-1, t-2, t-3, t-4, t-5
Tongzilin | t-2
Shigu | t-3
Subbasin 1 | t-5
Subbasin 2 | t-4
Subbasin 3 | t-4
Subbasin 4 | t-4
Subbasin 5 | t-5

forecasts is identified as one of the major factors influencing the use of streamflow forecasts in real-time reservoir operation models and water resources management (Li et al., 2010; Zhao et al., 2011, 2013; Yan et al., 2013). One critical issue in reservoir risk assessment is the determination of the probability distribution of inflow forecast errors, that is, the determination of uncertainties. In the literature, many studies have analyzed the influence of uncertainty on real-time reservoir operation or water resources management by assuming that the relative forecast error at each time is approximately normally distributed (Christensen, 2003; Stedinger et al., 2008). For example, Li et al. (2010) determined the flood-limited water level by considering the influence of inflow forecast uncertainty, assumed to follow a normal distribution. Zhao et al. (2011) analyzed the effect of uncertainty in streamflow forecasts on real-time reservoir operation by assuming a normal distribution of forecasting uncertainty. Yan et al. (2013) quantified inflow forecast errors and their impact on reservoir flood control operations by assuming the inflow forecast errors to be normally distributed. The Chinese design flood guidelines also assume the distribution of flood forecast errors to be normal (http://www.sciencedirect.com/science/article/pii/S0022169410004245; MWR, 2006).


FIGURE 13.2 Comparison between observed and predicted runoff values. (A) Jinsha River, and (B) Three Gorges Reservoir.

However, whether or not the flood forecast error follows a normal distribution is still to be investigated. In reality, the forecast error or uncertainty does not always follow the normal distribution. For example, Zhao et al. (2013) selected the TGR as a case study and indicated that the distribution of the real-world forecast uncertainty was non-Gaussian or non-normal with heavy tails. Diao et al. (2007) developed a model based on the principle of maximum entropy and indicated that the forecast errors approximately follow a normal distribution in the humid and semi-humid regions of China. Zhao et al. (2013) indicated that forecast uncertainty generally decreases over time, as more hydrologic information becomes available. The flood forecast error may follow a different distribution when the flood forecast lead time increases. Therefore, it may be difficult to justify that the forecast error always follows the normal distribution. Although forecast uncertainty analysis has been a focus of research in hydrology, there are comparatively few studies exploring the distribution of uncertainty. Therefore, one of the objectives of this study is to determine the distribution of uncertainty and investigate whether this kind of uncertainty, or flood forecast error, follows the normal distribution or not. The flood error series with different flood forecasting lead times are determined. Several statistical

TABLE 13.4 Comparisons of results obtained for flood forecasting at the two stations with different input variables.

Basin | Method | NSE (training) | NSE (validation) | RMSE (m3/s) (training) | RMSE (m3/s) (validation) | Qualified rate (training) | Qualified rate (validation)
TGR | LCC | 0.9231 | 0.9036 | 1476 | 2932 | 0.9857 | 0.8566
TGR | CE | 0.9402 | 0.9341 | 1281 | 2423 | 0.9898 | 0.9590
Jinsha River | LCC | 0.9781 | 0.9524 | 548 | 751 | 0.9880 | 0.9786
Jinsha River | CE | 0.9674 | 0.9170 | 649 | 906 | 0.9738 | 0.9393

CE, copula entropy; LCC, linear correlation coefficient; TGR, Three Gorges Reservoir.


FIGURE 13.3 Schematic of streamflow predicted at time t (where Q is the predicted streamflow, q is the observed streamflow, and e is the relative error between the predicted and observed streamflows).

distributions, which are widely used in hydrology, are selected to fit the flood relative error series. Goodness-of-fit criteria coupled with hypothesis testing are applied to justify the use of each distribution. The flood forecasting uncertainty is measured by the flood forecasting errors. The relative flood forecasting error is expressed as follows (Zhang et al., 2015):

$$e_{t,t+s} = \frac{Q_{t,t+s} - q_{t+s}}{q_{t+s}}, \qquad s = 1, 2, \ldots, h \tag{13.24}$$

where $Q_{t,t+s}$ is the streamflow predicted at time t for time t+s; h is the lead time of the flood forecast; $q_{t+s}$ is the observed streamflow at time t+s; and $e_{t,t+s}$ is the relative error between the predicted and observed streamflows at time t for time t+s. A schematic of flood forecasting for different lead times is shown in Fig. 13.3.
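The relative error series in Eq. (13.24) can be assembled as in the sketch below, assuming that the forecasts issued at each time t are stored row-wise for lead times 1 to h; the array layout and the tiny sample values are assumptions for illustration only.

```python
import numpy as np

def relative_errors(forecasts, observed, h):
    """e[t, s-1] = (Q[t, t+s] - q[t+s]) / q[t+s] for lead times s = 1..h (Eq. 13.24).

    forecasts: array (T, h), forecasts issued at time t for t+1 .. t+h
    observed : array (T + h,), observed streamflow
    """
    forecasts = np.asarray(forecasts, float)
    observed = np.asarray(observed, float)
    T = forecasts.shape[0]
    errors = np.empty((T, h))
    for s in range(1, h + 1):
        q_future = observed[s : s + T]              # q(t+s) aligned with issue time t
        errors[:, s - 1] = (forecasts[:, s - 1] - q_future) / q_future
    return errors

# Tiny placeholder example with h = 3 lead times
obs = np.array([100.0, 110.0, 130.0, 120.0, 115.0, 105.0])
fcst = np.array([[108.0, 128.0, 125.0],
                 [112.0, 126.0, 118.0],
                 [133.0, 119.0, 112.0]])
print(relative_errors(fcst, obs, h=3))
```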

13.4.1 Distributions for fitting flood forecasting errors

Seven distributions, i.e., the normal, P-III, GEV, GP, Gumbel, Wakeby, and exponential distributions, shown in Table 13.5, were used for fitting the flood forecasting error data. Two criteria, i.e., bias and RMSE, shown in Table 13.6, were used to evaluate the goodness of fit of these distributions.


TABLE 13.5 Seven statistical distributions used in the case study.

Normal distribution: $F(x) = \int_{-\infty}^{x} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\, dx$; x is the forecast error value, $\mu$ is the mean of the distribution, and $\sigma$ is its standard deviation.

Pearson type three distribution (P-III): $F(x) = \int_{u}^{x} \frac{\beta^{\alpha}(x-u)^{\alpha-1}}{\Gamma(\alpha)}\, e^{-\beta(x-u)}\, dx$; $\alpha$ is the shape parameter, $\beta$ is the scale parameter, and u is the location parameter.

Generalized extreme value distribution (GEV): $F(x) = \exp\left\{-\left[1 + \xi\left(\frac{x-u}{\sigma}\right)\right]^{-1/\xi}\right\}$; u is the location parameter, $\sigma$ is the scale parameter, and $\xi$ is the shape parameter.

Generalized Pareto distribution (GP): $F(x) = 1 - \left[1 - k\left(\frac{x-u}{\sigma}\right)\right]^{1/k}$; k is the shape parameter, $\sigma$ is the scale parameter, and u is the location parameter.

Gumbel distribution: $F(x) = \exp\left[-e^{-\alpha(x-\mu)}\right]$; $\alpha$ is the scale parameter and $\mu$ is the location parameter.

Wakeby distribution (defined through its quantile function): $x(F) = \zeta + \frac{\alpha}{\beta}\left[1-(1-F)^{\beta}\right] - \frac{\gamma}{\delta}\left[1-(1-F)^{-\delta}\right]$; $\alpha$, $\beta$, $\gamma$, $\delta$, and $\zeta$ are parameters.

Exponential distribution (EX): $F(x) = 1 - \exp\left(-\frac{x-u}{\sigma}\right)$; u is the location parameter and $\sigma$ is the scale parameter.

TABLE 13.6 Details of the two criteria used for evaluating the goodness of fit of the candidate distributions.

Bias: $\mathrm{Bias} = \dfrac{1}{N}\sum_{i=1}^{N}\dfrac{P_{the}(i) - P_{emp}(i)}{P_{emp}(i)}$

Root mean square error (RMSE): $\mathrm{RMSE} = \sqrt{\dfrac{1}{n}\sum_{i=1}^{n}\left[\dfrac{P_{the}(i) - P_{emp}(i)}{P_{emp}(i)}\right]^{2}}$

where $P_{the}$ is the theoretical probability, $P_{the}(i) = F(x_i)$, and $P_{emp}(i) = \dfrac{i - 0.44}{n + 0.12}$ is the empirical probability.


13.4.2 Determination of the distributions of flood forecasting uncertainties at TGR

The streamflow of the TGR consists of three components: the main upstream inflow, the tributary inflow from the Wu River, and the rainfall-runoff from the TGR intervening basin. The intervening basin has a catchment area of 55,907 km2, about 5.6% of the TGR upstream Yangtze River basin. Over the whole intervening basin, there are 40 rain gauge stations and two hydrographic stations (Cuntan and Wulong), which control the upstream inflow and tributary inflow, respectively (Li et al., 2010). The inflow of the TGR was taken as the total streamflow of Cuntan, Wulong, and the intervening basin. As the Xinanjiang model was tested to be the most appropriate model for predicting the streamflow of the intervening basin of the TGR and has been applied in the real-time flood forecasting system, this model was used to simulate the rainfall-runoff in the intervening basin with different lead times. The daily streamflow data for the period from June 1, 2003, to December 31, 2006, were used for flood forecasting with 1-day, 2-day, and 3-day lead times. The relative errors for the 1-day, 2-day, and 3-day lead times are shown in Fig. 13.4. It can be seen that the relative error increases with increasing lead time. The normal, Wakeby, exponential, Gumbel, Pearson type three, generalized Pareto, and generalized extreme value distributions were used to fit the 1-day, 2-day, and 3-day relative flood forecasting errors for the TGR. The L-moment method was employed to estimate the parameters of these distributions. Values of the estimated parameters are listed in Table 13.7. The bias and RMSE of the observed and theoretical streamflow forecasting errors are given in Table 13.8. The chi-squared (χ²) test was employed as the goodness-of-fit test to evaluate the validity of the assumption that the relative flood error followed the distributions mentioned above. The results of the goodness-of-fit tests are summarized in Table 13.9, where the symbol "✓" indicates that the distribution satisfies the criterion and the symbol "✗" means that the particular distribution is not fitted well according to the criterion. The results indicated that the Wakeby is the only distribution that could not be rejected by the goodness-of-fit criteria at the 5% significance level. The frequency histograms of the flood forecast errors for the TGR fitted with the normal and Wakeby distributions are shown in Fig. 13.5. The marginal distributions of flood forecasting errors are shown in Fig. 13.6, in which the line represents the theoretical distribution and the symbols indicate the empirical frequencies of observations. Both Figs. 13.5 and 13.6 indicate that the Wakeby distribution fitted the data adequately, and thus it is a better fit than the normal distribution. Based on this result, it is concluded that the assumption of a normal distribution of forecasting errors may not be justified in some cases. Therefore, it is necessary to carry out hypothesis testing first to verify whether the normality assumption is true or not for the errors of streamflow forecasts prior to further use.


FIGURE 13.4 The relative error series with 1-day, 2-day, and 3-day lead time at the Three Gorges Reservoir.
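A hedged sketch of the kind of check described above is given below: several candidate distributions are fitted to a relative-error sample and a chi-square statistic is computed against the fitted CDF. The sketch uses maximum-likelihood fitting rather than L-moments, a synthetic heavy-tailed sample, and only distributions readily available in scipy.stats (the Wakeby distribution is not included); it illustrates the workflow only and does not reproduce the values in Tables 13.7–13.9.

```python
import numpy as np
from scipy import stats

def chi_square_stat(sample, dist, params, n_bins=8):
    """Chi-square statistic comparing observed bin counts with fitted-CDF expectations."""
    edges = np.quantile(sample, np.linspace(0, 1, n_bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf          # cover the full support
    observed, _ = np.histogram(sample, bins=edges)
    cdf = dist.cdf(edges, *params)
    expected = len(sample) * np.diff(cdf)
    return np.sum((observed - expected) ** 2 / expected)

rng = np.random.default_rng(3)
errors = rng.standard_t(df=4, size=1000) * 5.0     # placeholder heavy-tailed error sample (%)

candidates = {"normal": stats.norm, "GEV": stats.genextreme,
              "Pearson III": stats.pearson3, "Gumbel": stats.gumbel_r}
for name, dist in candidates.items():
    params = dist.fit(errors)                      # maximum-likelihood fit
    print(f"{name:12s} chi-square = {chi_square_stat(errors, dist, params):8.1f}")
```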

13.5 Flood forecast uncertainty simulation

13.5.1 Flood forecasting uncertainties simulation based on copulas

As the lead time of a flood forecasting model is generally more than 1 day and there are dependences among the uncertainties with different lead times, simulation of flood forecast uncertainties is a multivariate problem. The copula method was selected for simulating flood forecast uncertainties with different


TABLE 13.7 Parameters of the selected distributions for fitting the flood error series.

Order | Distribution | 1-day | 2-day | 3-day
1 | Normal | μ = 0.45, σ = 4.52 | μ = 0.92, σ = 6.32 | μ = 1.81, σ = 8.58
2 | Pearson type three | α = 16.00, β = 0.63, μ = 24.56 | α = 34.60, β = 1.30, μ = 26.26 | α = 5.54, β = 0.27, μ = 18.85
3 | Generalized extreme value | μ = 1.81, σ = 4.73, ζ = 0.39 | μ = 2.68, σ = 6.73, ζ = 0.43 | μ = 3.7, σ = 9.46, ζ = 0.55
4 | Generalized Pareto | μ = 871, σ = 1849, k = 1.24 | μ = -1287, σ = 2812, k = 1.35 | μ = -1949, σ = 4693, k = 1.65
5 | Exponential | μ = 5.55, α = 5.10 | μ = 8.05, α = 7.13 | μ = 11.49, α = 9.68
6 | Gumbel | μ = 2.58, α = 3.68 | μ = 3.88, α = 5.14 | μ = 5.84, α = 6.99
7 | Wakeby | α = 166.9, β = 11.60, γ = 2.91, δ = 5.28, ζ = 16.76 | α = 193.09, β = 10.05, γ = 4.25, δ = 1.22, ζ = 22.59 | α = 229, β = 8.75, γ = 6.21, δ = 0.11, ζ = 30.9

σ, standard deviation; μ, expectation; α, β, γ, k, ζ, and δ are the parameters of the respective distributions.

TABLE 13.8 Results of calculated bias and RMSE for the 1-day, 2-day, and 3-day series.

Distribution | Bias (1-day) | RMSE (1-day) | Bias (2-day) | RMSE (2-day) | Bias (3-day) | RMSE (3-day)
Normal | 0.0063 | 4.97 | 0.0111 | 4.50 | 0.0206 | 5.04
P-III | 0.0051 | 4.88 | 0.0047 | 3.97 | 0.0054 | 3.25
GEV | 0.0051 | 5.34 | 0.0052 | 4.50 | 0.0063 | 3.83
GP | 0.0075 | 8.22 | 0.0096 | 7.55 | 0.0136 | 7.25
Ex | 0.0712 | 12.19 | 0.0730 | 12.39 | 0.0781 | 13.40
Gumbel | 0.0412 | 7.73 | 0.0448 | 7.94 | 0.0524 | 9.09
Wakeby | -0.0016 | 1.34 | -0.0007 | 1.52 | -0.0005 | 1.44

Ex, exponential distribution; GEV, generalized extreme value distribution; GP, generalized Pareto distribution; P-III, Pearson type III distribution; RMSE, root mean square error.

TABLE 13.9 Results of the χ² test for the 1-day, 2-day, and 3-day series.

Distribution | χ² (1-day) | Select | χ² (2-day) | Select | χ² (3-day) | Select
Normal | 1306 | ✗ | 430 | ✗ | 187 | ✗
P-III | 261 | ✗ | 89 | ✗ | 73 | ✗
GEV | 761 | ✗ | 68 | ✗ | 83 | ✗
GP | 1327 | ✗ | 831 | ✗ | 282 | ✗
Ex | 176 | ✗ | 315 | ✗ | 562 | ✗
Gumbel | 3201 | ✗ | 232 | ✗ | 228 | ✗
Wakeby | 14 | ✓ | 7.25 | ✓ | 7.43 | ✓

The critical value of the chi-square test with seven degrees of freedom is χ²(7) = 14.07 for each lead time; "✗" means that the distribution is rejected, and "✓" means that the distribution is accepted.


FIGURE 13.5 Frequency histograms of flood forecasting errors fitted by the normal and Wakeby distributions.


FIGURE 13.6 Marginal distributions of flood forecasting errors fitted by the normal and Wakeby distributions.

lead times. The copula-based joint distribution of these uncertainties with different lead times can be expressed as (Chen and Guo, 2019):

$$F(e_1, e_2, \ldots, e_s, \ldots, e_h) = C(u_1, u_2, \ldots, u_s, \ldots, u_h) \tag{13.25}$$

where $e_s$ ($s = 1, 2, \ldots, h$) is the relative forecasting error series for lead time s, and $u_s$ ($s = 1, 2, \ldots, h$) is the CDF of $e_s$.


The metaelliptical copula and the vine copula are often employed to determine a higher-dimensional joint distribution (Chen et al., 2016). For high-dimensional variables, the regular vine copula embraces a large number of possible pair-copula decompositions (Aas et al., 2009), which means that some additional structure may be needed to select a suitable vine composition. The metaelliptical copula can model arbitrary pairwise dependences between variables through a correlation matrix. Therefore, considering the advantages of metaelliptical copulas and the limitations of the vine copula, the metaelliptical copula was selected in this study. The Student t copula function was selected to construct the joint distribution of the flood forecast errors, and the maximum likelihood estimation (MLE) method was used to estimate the parameters of the joint distribution. It has been shown above that the flood forecasting uncertainty series do not always follow the normal distribution. Thus, the distributions commonly used in hydrology, including the exponential (EXP), generalized extreme value (GEV), generalized logistic (GLO), generalized Pareto (GPA), generalized normal (GNO), Gumbel, normal, Pearson type III (P-III), Wakeby, and Weibull distributions, were selected as candidate marginal distributions for fitting the uncertainties. The L-moment method was used to estimate the parameters of the marginal distributions. Then, the best-fit distribution was selected based on the RMSE and the Akaike information criterion (Akaike, 1974; Zhang and Singh, 2007, 2019; Chen et al., 2017). Based on the joint distribution, flood forecasting uncertainties were simulated following the three-step procedure explained below.

Step 1: Use the copula function to derive the joint distribution of flood forecasting uncertainties $F(e_1, e_2, \ldots, e_h)$, and estimate its parameters by the MLE method.

Step 2: Generate a uniform random number $\varepsilon$, and let $C(u_1, u_2, \ldots, u_h) = \varepsilon$. The vector $(u_1, u_2, \ldots, u_h)$ can be obtained from the inverse function $C^{-1}(\varepsilon)$.

Step 3: Obtain the error vector from the inverse functions of the marginal distributions, $e_s = F^{-1}(u_s)$.

Finally, simulate the forecast uncertainties by repeating steps (2) and (3).
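A minimal sketch of steps (2) and (3) is shown below: samples with the desired dependence are drawn from a Student t copula (by transforming correlated multivariate-t draws to uniforms) and then mapped through the Wakeby quantile function, which has the closed form x(F) = ξ + (α/β)[1 − (1−F)^β] − (γ/δ)[1 − (1−F)^(−δ)]. The correlation matrix, the degrees of freedom, and the Wakeby parameters are placeholders, not the values estimated for the TGR.

```python
import numpy as np
from scipy.stats import multivariate_t, t

def wakeby_quantile(F, alpha, beta, gamma, delta, xi):
    """Closed-form Wakeby quantile function x(F)."""
    F = np.asarray(F, float)
    return (xi + (alpha / beta) * (1.0 - (1.0 - F) ** beta)
               - (gamma / delta) * (1.0 - (1.0 - F) ** (-delta)))

# Student t copula sampling: correlated t draws -> uniforms via the t CDF
corr = np.array([[1.0, 0.8, 0.6],
                 [0.8, 1.0, 0.8],
                 [0.6, 0.8, 1.0]])       # placeholder dependence between the three lead times
df = 5                                    # placeholder degrees of freedom
z = multivariate_t(loc=np.zeros(3), shape=corr, df=df, seed=4).rvs(size=10_000)
u = t.cdf(z, df)                          # uniform margins, dependence preserved

# Map each margin through a (placeholder) Wakeby marginal for e1, e2, e3
wakeby_params = [(160.0, 11.0, 3.0, 5.0, -17.0),
                 (190.0, 10.0, 4.0, 1.2, -23.0),
                 (230.0, 8.8, 6.2, 0.11, -31.0)]
errors = np.column_stack([wakeby_quantile(u[:, j], *wakeby_params[j]) for j in range(3)])
print("simulated error means (%):", errors.mean(axis=0).round(2))
```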

13.5.2 Simulation of flood forecasting uncertainties

In the real-time flood control operation of the TGR, flood forecasting with a lead time of 3 days was considered, so a three-dimensional copula function was selected to construct the joint distribution of flood forecasting errors. All candidate distributions given in Table 13.5 were used to fit the flood uncertainty data. The calculated values of the bias and RMSE for each marginal distribution are given in Table 13.8 and show that the Wakeby distribution was the best-fitted distribution for flood forecast errors at the different lead times. The estimated parameters of the marginal and joint distributions are shown in Table 13.8 and Fig. 13.7, respectively.

FIGURE 13.7 Estimated parameters of the joint distribution (the unit of axes is %).

The forecast error series were then generated by simulation from the Student t copula. As the forecasting lead time was 3 days, the vector $(u_1, u_2, u_3)$ was obtained from the inverse function $C^{-1}(u_1, u_2, u_3)$, and the uncertainty vector $(e_1, e_2, e_3)$ was obtained from the inverse functions of the fitted Wakeby marginal distributions, $e_1 = F_1^{-1}(u_1)$, $e_2 = F_2^{-1}(u_2)$, and $e_3 = F_3^{-1}(u_3)$. The three-dimensional forecast error series was thus simulated by repeating the simulation steps. The statistics of the observed and simulated forecasting errors, including the mean error, coefficient of variation (Cv), and skewness coefficient (Cs), were compared for the different lead times (Chen et al., 2016). These statistics are illustrated by the box plots in Fig. 13.8, which show that the statistics of the generated series adequately match those of the observed series.
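A small sketch of this statistics comparison is given below, assuming the observed and copula-simulated error series are available as one-dimensional arrays per lead time; the mean, Cv, and Cs follow the usual sample estimators, and the placeholder arrays are illustrative rather than the TGR data.

```python
# Sketch: compare mean, coefficient of variation (Cv), and skewness coefficient (Cs)
# of observed versus simulated forecast errors for one lead time.
import numpy as np
from scipy import stats

def error_statistics(series):
    """Return (mean, Cv, Cs) of a 1-D error series using sample estimators."""
    x = np.asarray(series, dtype=float)
    mean = x.mean()
    cv = x.std(ddof=1) / mean                 # coefficient of variation
    cs = stats.skew(x, bias=False)            # sample skewness coefficient
    return mean, cv, cs

rng = np.random.default_rng(0)
observed = rng.gamma(shape=4.0, scale=3.0, size=300)     # placeholder observed errors
simulated = rng.gamma(shape=4.0, scale=3.0, size=5000)   # placeholder simulated errors
for label, series in (("observed", observed), ("simulated", simulated)):
    m, cv, cs = error_statistics(series)
    print(f"{label:9s}  mean = {m:6.2f}   Cv = {cv:5.2f}   Cs = {cs:5.2f}")
```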

FIGURE 13.8 Mean value, Cv, and Cs of the observed and simulated forecast errors for different lead times. (A) Mean value, (B) coefficient of variation, and (C) skewness.

13.6 Conclusions

In this chapter, a new analytical method based on the CE theory is first presented to determine the significant or influential inputs of an ANN model used for streamflow forecasting. Application of the proposed method is demonstrated through two case studies performed for the TGR, and its accuracy is compared with that of the traditional method. Second, as flood forecast uncertainty is important for flood risk analysis and reservoir operation, the statistical characteristics of flood forecast uncertainties are examined and statistical distributions are fitted to them. Considering the dependences among the flood forecast uncertainties at different lead times, a copula-based method is proposed for simulating flood forecast uncertainty. The major conclusions are as follows:

(1) The two case studies showed that the proposed ANN model combined with the CE method performed well in flood forecasting. Moreover, the flood forecasting model using the input dataset selected by the CE method outperformed the model whose inputs were selected by the traditional LCC method.

(2) The analysis showed that the assumption that flood forecast uncertainties follow the normal distribution does not always hold, so estimating flood risk under this assumption may lead to incorrect results. It is therefore necessary to test the normality hypothesis of a streamflow forecasting error series before using it.

(3) The proposed copula-based method can be used to simulate flood forecast uncertainty. The statistics of the generated series adequately matched those of the observed series, and the simulation preserves the dependence structure of the flood forecast uncertainties across lead times. Such simulation of flood forecast uncertainty is very useful for flood risk analysis.

References

Aas, K., Czado, C., Frigessi, A., Bakken, H., 2009. Pair-copula constructions of multiple dependence. Insur. Math. Econ. 44 (2), 182-198. https://doi.org/10.1016/j.insmatheco.2007.02.001.
Abrahart, R.J., See, L., 2002. Multi-model data fusion for river flow forecasting: an evaluation of six alternative methods based on two contrasting catchments. Hydrol. Earth Syst. Sci. 6 (4), 655-670. https://doi.org/10.5194/hess-6-655-2002.
Akaike, H., 1974. Markovian representation of stochastic processes and its application to the analysis of autoregressive moving average processes. Ann. Inst. Stat. Math. 26 (1), 363-387. https://doi.org/10.1007/978-1-4612-1694-0_17.
Alfonso, L., Lobbrecht, A., Price, R., 2010. Optimization of water level monitoring network in polder systems using information theory. Water Resour. Res. 46, W12553. https://doi.org/10.1029/2009WR008953.
Alfonso, L., He, L., Lobbrecht, A., Price, R., 2013. Information theory applied to evaluate the discharge monitoring network of the Magdalena River. J. Hydroinf. 15 (1), 211-228. https://doi.org/10.2166/hydro.2012.066.
Bazrafshan, O., Zamani, H., Shekari, M., 2020. A copula-based index for drought analysis in arid and semi-arid regions of Iran. Nat. Resour. Model. 33 (1), e12237. https://doi.org/10.1111/nrm.12237.
Berntsen, J., Espelid, T.O., Genz, A., 1991. An adaptive algorithm for the approximate calculation of multiple integrals. ACM Trans. Math. Software 17, 437-451. https://doi.org/10.1145/210232.210233.
Bevacqua, E., Maraun, D., Haff, I.H., 2017. Multivariate statistical modelling of compound events via pair-copula constructions: analysis of floods in Ravenna (Italy). Hydrol. Earth Syst. Sci. 21 (6), 2701-2723. https://doi.org/10.5194/hess-21-2701-2017.


Bhuyan-Erhardt, U., Erhardt, T.M., Laaha, G., 2019. Validation of drought indices using environmental indicators: streamflow and carbon flux data. Agric. For. Meteorol. 265, 218e226. https://doi.org/10.1016/j.agrformet.2018.11.016. Birikundavyi, S., Labib, R., Trung, H.T., Rousselle, J., 2002. Performance of neural networks in daily streamflow forecasting. J. Hydrol. Eng. ASCE 7 (5), 392e398. https://doi.org/10.1061/ (ASCE)1084-0699(2002)7:5(392). Bowden Gavin, J., Dandy Graeme, C., Maier Holger, R., 2005a. Input determination for neural network models in water resources applications. Part 1-background and methodology. J. Hydrol. 301 (1e4), 75e92. https://doi.org/10.1016/j.jhydrol.2004.06.021. Bowden Gavin, J., Maierb Holger, R., Dandy Graeme, C., 2005b. Input determination for neural network models in water resources applications. Part 2. Case study: forecasting salinity in a river. J. Hydrol. 301 (1e4), 93e107. https://doi.org/10.1016/j.jhydrol.2004.06.020. Calsaverini, R.S., Vicente, R., 2009. An information-theoretic approach to statistical dependence: copula information. Europ. Phys. Lett. 88 (6), 3e12. https://doi.org/10.1209/0295-5075/88/ 68003. Castellano-Me´ndeza, M., Gonza´lez-Manteigaa, W., Febrero-Bande, M., Prada-Sa´ncheza, M.J., Lozano-Caldero´n, R., 2004. Modeling of the monthly and daily behavior of the runoff of the Xallas river using BoxeJenkins and neural networks methods. J. Hydrol. 296, 38e58. https:// doi.org/10.1016/j.jhydrol.2004.03.011. Chen, L., Guo, S., 2019. Copulas and its Application in Hydrology and Water Resources. Springer water (Chapter 2-10). https://doi.org/10.1007/978-981-13-0574-0. Chen, L., Guo, S.L., Yan, B.W., Liu, P., Fang, B., 2010. A new seasonal design flood method based on bivariate joint distribution of flood magnitude and date of occurrence. Hydrol. Sci. J. 55, 1264e1280. https://doi.org/10.1080/02626667.2010.520564. Chen, L., Singh, V.P., Shenglian, G., Hao, Z., Li, T., 2012. Flood coincidence risk analysis using multivariate copula functions. J. Hydrol. Eng. ASCE 17 (6), 742e755. https://doi.org/10.1061/ (ASCE)HE.1943-5584.0000504. Chen, L., Singh, V.P., Guo, S., Mishra, A.K., Guo, J., 2013. Drought analysis based on copulas. J. Hydrol. Eng. ASCE 18 (7), 797e808. https://doi.org/10.1061/(ASCE)HE.19435584.0000697. Chen, L., Ye, L., Singh, V., Asce, F., Zhou, J., Guo, S., 2014. Determination of input for artificial neural networks for flood forecasting using the copula entropy method. J. Hydrol. Eng. ASCE 19 (11), 217e226. https://doi.org/10.1061/(ASCE)HE.1943-5584.0000932. Chen, L., Singh, V.P., Lu, W., Zhang, J., Zhou, J., Guo, S., 2016. Streamflow forecast uncertainty evolution and its effect on real-time reservoir operation. J. Hydrol. 540, 712e726. https:// doi.org/10.1016/j.jhydrol.2016.06.015. Chen, J., Zhong, P.A., Zhang, Y., Navar, D., Yeh, W.W., 2017. A decomposition-integration risk analysis method for real-time operation of a complex flood control system. Water Resour. Res. 53, 2490e2506. https://doi.org/10.1002/2016WR019842. Christensen, S., 2003. A synthetic groundwater modelling study of the accuracy of GLUE uncertainty intervals. Nord. Hydrol. 35 (1), 45e59. https://doi.org/10.2166/nh.2004.0004. Cover, T.M., Thomas, J.A., 1991. Elements of Information Theory. Wiley, New York, p. 542. Davies, L., Gather, U., 1993. The identification of multiple outliers. J. Am. Stat. Assoc. 88 (423), 782e792. https://doi.org/10.2307/2290763. de Vos, N.J., Rientjes, T.H.M., 2005. 
Constraints of artificial neural networks for rainfall-runoff modelling: trade-offs in hydrological state representation and model evaluation. Hydrol. Earth Syst. Sci. 9 (34), 111e126. https://doi.org/10.5194/hess-9-111-2005.

Diao, Y.F., Wang, B.D., Liu, J., 2007. Study on distribution of flood forecasting errors by the method based on maximum entropy. J. Hydraul. Eng. 38 (5), 591-595.
Durocher, M., Burn, D.H., Zadeh, S.M., 2018. A nationwide regional flood frequency analysis at ungauged sites using ROI/GLS with copulas and super regions. J. Hydrol. 567, 191-202. https://doi.org/10.1016/j.jhydrol.2018.10.011.
Favre, A.-C., Adlouni, S., Perreault, L., Thiémonge, N., Bobée, B., 2004. Multivariate hydrological frequency analysis using copulas. Water Resour. Res. 40, W01101. https://doi.org/10.1029/2003WR002456.
Fernando, T.M.K.G., Maier, H.R., Dandy, G.C., 2009. Selection of input variables for data driven models: an average shifted histogram partial mutual information estimator approach. J. Hydrol. 367, 165-176. https://doi.org/10.1016/j.jhydrol.2008.10.019.
Filipova, V., Lawrence, D., Klempe, H., 2018. Effect of catchment properties and flood generation regime on copula selection for bivariate flood frequency analysis. Acta Geophys. 66 (4), 791-806. https://doi.org/10.1007/s11600-018-0113-6.
Fu, G., Butler, D., 2014. Copula-based frequency analysis of overflow and flooding in urban drainage systems. J. Hydrol. 510, 49-58. https://doi.org/10.1016/j.jhydrol.2013.12.006.
Gauthier, T.D., 2001. Detecting trends using Spearman's rank correlation coefficient. Environ. Forensics 2 (4), 359-362. https://doi.org/10.1006/enfo.2001.0061.
Grimaldi, S., Serinaldi, F., 2006. Design hyetographs analysis with 3-copula function. Hydrol. Sci. J. 51 (2), 223-238. https://doi.org/10.1623/hysj.51.2.223.
Hao, C., Zhang, J., Yao, F., 2017. Multivariate drought frequency estimation using copula method in Southwest China. Theor. Appl. Climatol. 127 (3-4), 977-991. https://doi.org/10.1007/s00704-015-1678-5.
Harmancioglu, N.B., Yevjevich, V., 1987. Transfer of hydrologic information among river points. J. Hydrol. 91, 103-111. https://doi.org/10.1016/0022-1694(87)90131-4.
Hejazi, M.I., Cai, X., Ruddell, B., 2008. The role of hydrologic information to reservoir operations - learning from past releases. Adv. Water Resour. 31 (12), 1636-1650. https://doi.org/10.1016/j.advwatres.2008.07.013.
Hou, J., Fang, W., Cheng, M., 2019. Joint probability analysis of tropical cyclone wind and rainfall for integrated hazard severity assessment in Hainan. J. Nat. Prod. 28 (3), 54-64.
Hsu, K.L., Gupta, H.V., Sorooshian, S., 1995. Artificial neural network modeling of the rainfall-runoff process. Water Resour. Res. 31 (10), 2517-2530. https://doi.org/10.1029/95WR01955.
Jain, S.K., Das, A., Srivastava, D.K., 1999. Application of ANN for reservoir inflow prediction and operation. J. Water Resour. Plann. Manag. 125 (5), 263-271. https://doi.org/10.1061/(ASCE)0733-9496(1999)125:5(263).
Jhong, B.-C., Tung, C.-P., 2018. Evaluating future joint probability of precipitation extremes with a copula-based assessing approach in climate change. Water Resour. Manag. 32 (13), 4253-4274. https://doi.org/10.1007/s11269-018-2045-y.
Joe, H., 1997. Multivariate Models and Multivariate Dependence Concepts. CRC Press, p. 424.
Kapur, J.N., Kesavan, H.K., 1992. Entropy optimization principles and their applications. In: Singh, V.P., Fiorentino, M. (Eds.), Entropy and Energy Dissipation in Water Resources. Water Science and Technology Library, Volume 9. Springer, Dordrecht, pp. 3-20. https://doi.org/10.1007/978-94-011-2430-0_1.
Kasiviswanathan, K.S., Sudheer, K.P., 2013.
Quantification of the predictive uncertainty of artificial neural network based river flow forecast models. Stoch. Environ. Res. Risk Assess. 27 (1), 137e146. https://doi.org/10.1007/s00477-012-0600-2.


Khan, S., Ganguly, A.R., Bandyopadhyay, S., Saigal, S., Erickson III, D.J., Protopopescu, V., Ostrouchov, G., 2006. Nonlinear statistics reveals stronger ties between ENSO and the tropical hydrological cycle. Geophys. Res. Lett. 33, L24402. https://doi.org/10.1029/2006GL027941.
Kraskov, A., Stögbauer, H., Grassberger, P., 2004. Estimating mutual information. Phys. Rev. E 69 (6), 066138. https://doi.org/10.1103/PhysRevE.69.066138.
Krstanovic, P.F., Singh, V.P., 1992a. Evaluation of rainfall networks using entropy: 1. Theoretical development. Water Resour. Manag. 6, 279-293. https://doi.org/10.1007/BF00872281.
Krstanovic, P.F., Singh, V.P., 1992b. Evaluation of rainfall networks using entropy: 2. Application. Water Resour. Manag. 6, 295-315. https://doi.org/10.1007/BF00872282.
Lachtermacher, G., Fuller, J.D., 1994. Backpropagation in hydrological time series forecasting. In: Hipel, K.W., McLeod, A.I., Panu, U.S., Singh, V.P. (Eds.), Stochastic and Statistical Methods in Hydrology and Environmental Engineering. Water Science and Technology Library, Volume 10/3. Springer, Dordrecht, pp. 229-242. https://doi.org/10.1007/978-94-017-3083-9_18.
Ávila, L., Mine, M.R.M., Kaviski, E., et al., 2020. Complementarity modeling of monthly streamflow and wind speed regimes based on a copula-entropy approach: a Brazilian case study. Appl. Energy 259, 114127. https://doi.org/10.1016/j.apenergy.2019.114127.
Li, W., 1990. Mutual information functions versus correlation functions. J. Stat. Phys. 60, 823-837. https://doi.org/10.1007/BF01025996.
Li, X., Guo, S.L., Liu, P., Chen, G.Y., 2010. Dynamic control of flood limited water level for reservoir operation by considering inflow uncertainty. J. Hydrol. 391, 124-132. https://doi.org/10.1016/j.jhydrol.2010.07.011.
Li, H., Wang, D., Singh, V.P., 2019. Non-stationary frequency analysis of annual extreme rainfall volume and intensity using Archimedean copulas: a case study in eastern China. J. Hydrol. 571, 114-131. https://doi.org/10.1016/j.jhydrol.2019.01.054.
Li, F., Zheng, Q., 2016. Probabilistic modelling of flood events using the entropy copula. Adv. Water Resour. 97, 233-240. https://doi.org/10.1016/j.advwatres.2016.09.016.
Liu, C., Zhou, Y., Sui, J., 2015. Research of methodology of multivariate analysis of design storm based on 3-copula function. J. Harbin Inst. Technol. 47 (4), 87-92. https://doi.org/10.11918/j.issn.0367-6234.2015.04.015.
Liu, C., Zhou, Y., Sui, J., 2018. Multivariate frequency analysis of urban rainfall characteristics using three-dimensional copulas. Water Sci. Technol. 1, 206-218. https://doi.org/10.2166/wst.2018.103.
Liu, Y.R., Li, Y.P., Ma, Y., 2020. Development of a Bayesian-copula-based frequency analysis method for hydrological risk assessment - the Naryn River in Central Asia. J. Hydrol. 580, 124349. https://doi.org/10.1016/j.jhydrol.2019.124349.
Ma, J., Sun, Z., 2008. Mutual information is copula entropy. Tsinghua Sci. Technol. 16 (1), 51-54. https://doi.org/10.1016/S1007-0214(11)70008-6.
Maier, H.R., Dandy, G.C., 2000. Neural networks for the prediction and forecasting of water resources variables: a review of modeling issues and applications. Environ. Model. Software 15, 101-124. https://doi.org/10.1016/S1364-8152(99)00007-9.
May, R.J., Dandy, G.C., Maier, H.R., Nixon, J.B., 2008a. Application of partial mutual information variable selection to ANN forecasting of water quality in water distribution systems. Environ. Model. Software 23, 1289-1299. https://doi.org/10.1016/j.envsoft.2008.03.008.
May, R.J., Maier, H.R., Dandy, G.C., Fernando, T.M.K., 2008b. Non-linear variable selection for artificial neural networks using partial mutual information. Environ. Model. Software 23, 1312-1326. https://doi.org/10.1016/j.envsoft.2008.03.007.

Mesbahzadeh, T., Miglietta, M.M., Mirakbari, M., Abdolhoseini, M., 2019. Joint modeling of precipitation and temperature using copula theory for current and future prediction under climate change scenarios in arid lands (case study, Kerman province, Iran). Adv. Meteorol. 6848049. https://doi.org/10.1155/2019/6848049.
Molini, A., La Barbera, P., Lanza, L.G., 2006. Correlation patterns and information flows in rainfall fields. J. Hydrol. 322 (1-4), 89-104. https://doi.org/10.1016/j.jhydrol.2005.02.041.
MWR (Ministry of Water Resources), 2006. Standard for Hydrological Information and Hydrological Forecasting. SL250-2000 (in Chinese).
Nazemi, A., Elshorbagy, A., 2012. Application of copula modelling to the performance assessment of reconstructed watersheds. Stoch. Environ. Res. Risk Assess. 26 (2), 189-205. https://doi.org/10.1007/s00477-011-0467-7.
Nelsen, R.B., 2006. An Introduction to Copulas, second ed. Springer Science & Business Media, p. 272. https://doi.org/10.1007/0-387-28678-0.
Ng, W.W., Panu, U.S., Lennox, W.C., 2007. Chaos based analytical techniques for daily extreme hydrological observations. J. Hydrol. 342 (1-2), 17-41. https://doi.org/10.1016/j.jhydrol.2007.04.023.
Ozga-Zielinski, B., Ciupak, M., Adamowski, J., 2016. Snow-melt flood frequency analysis by means of copula based 2D probability distributions for the Narew River in Poland. J. Hydrol. Reg. Stud. 6, 26-51. https://doi.org/10.1016/j.ejrh.2016.02.001.
Qian, L., Wang, H., Dang, S., 2018. Modelling bivariate extreme precipitation distribution for data-scarce regions using Gumbel-Hougaard copula with maximum entropy estimation. Hydrol. Process. 32 (2), 212-227. https://doi.org/10.1002/hyp.11406.
Raman, H., Sunilkumar, N., 1995. Multivariate modeling of water resources time series using artificial neural networks. Hydrol. Sci. J. 40 (2), 145-163. https://doi.org/10.1080/02626669509491401.
Saghafian, B., Sanginabadi, H., 2020. Multivariate groundwater drought analysis using copulas. Nord. Hydrol. 51 (4), 666-685. https://doi.org/10.2166/nh.2020.131.
Salvadori, G., De Michele, C., 2010. Multivariate multiparameter extreme value models and return periods: a copula approach. Water Resour. Res. 46, W10501. https://doi.org/10.1029/2009WR009040.
Salvadori, G., De Michele, C., Kottegoda, N.T., Rosso, R., 2007. Extremes in Nature: An Approach Using Copulas. Springer Science & Business Media, p. 292. https://doi.org/10.1007/1-4020-4415-1.
Serinaldi, F., Bonaccorso, B., Cancelliere, A., Grimaldi, S., 2009. Probabilistic characterization of drought properties through copulas. Phys. Chem. Earth 34 (10-12), 596-605. https://doi.org/10.1016/j.pce.2008.09.004.
Shamseldin, A.Y., 1997. Application of a neural network technique to rainfall-runoff modelling. J. Hydrol. 199, 272-294. https://doi.org/10.1016/S0022-1694(96)03330-6.
Shannon, C.E., 1948. A mathematical theory of communication. Bell Syst. Tech. J. 27, 379-423.
Sharma, A., 2000. Seasonal to interannual rainfall probabilistic forecasts for improved water supply management: Part 1: a strategy for system predictor identification. J. Hydrol. 239, 232-239. https://doi.org/10.1016/S0022-1694(00)00346-2.
Singh, V.P., 2000. The entropy theory as a tool for modeling and decision making in environmental and water resources. J. Water Soci. Am. 1, 1-11. https://doi.org/10.1007/978-3-540-36212-8_15.
Singh, V.P., 2013. Entropy Theory and its Application in Environmental and Water Engineering. Wiley-Blackwell, Hoboken, New Jersey, USA, p. 642.
https://doi.org/10.1002/9781118428306. Singh, V.P., 2014. Entropy Theory in Hydraulic Engineering: An Introduction. ASCE Press, Reston, Virginia, USA, p. 784. https://doi.org/10.1061/9780784412725.


Singh, V.P., 2015. Entropy Theory in Hydrologic Science and Engineering. McGraw-Hill Education, New York, p. 824. Singh, V.P., 2016. Introduction to Tsallis Entropy Theory in Water Engineering. CRC Press, Boca Raton, Florida, USA, p. 434. Sklar, A., 1959. Fonctions de re´partition a` n dimensions et leursmarges, vol. 8. Publ. Inst. Stat. Univ, Paris, p. 229e231. Song, S., Singh, V.P., 2010a. Meta-elliptical copulas for drought frequency analysis of periodic hydrologic data. Stoch. Environ. Res. Risk Assess. 24 (3), 425e444. https://doi.org/10.1007/ s00477-009-0331-1. Song, S., Singh, V.P., 2010b. Frequency analysis of droughts using the Plackett copula and parameter estimation by genetic algorithm. Stoch. Environ. Res. Risk Assess. 24 (5), 783e805. https://doi.org/10.1007/s00477-010-0364-5. Stedinger, J.R., Vogel, R.M., Lee, S.U., Batchelder, R., 2008. Appraisal of the generalized likelihood uncertainty estimation (GLUE) method. Water Resour. Res. 44, W00B06. https:// doi.org/10.1029/2008WR006822. Steuer, R., 2006. On the analysis and interpretation of correlations in metabolomic data. Briefings Bioinf. 7 (2), 151e158. https://doi.org/10.1093/bib/bbl009. Steuer, R., Kurths, J., Daub, C.O., Weise, J., Selbig, J., 2002. The mutual information: detecting and evaluating dependencies between variables. Bioinformatics 18, 231e240. https://doi.org/ 10.1093/bioinformatics/18.suppl_2.S231. Thirumalaiah, K., Deo, M.C., 2000. Hydrological forecasting using neural networks. J. Hydrol. Eng. ASCE 5 (2), 180e189. https://doi.org/10.1061/(ASCE)1084-0699(2000)5:2(180). Vergni, L., Todisco, F., Di Lena, B., Mannocchi, F., 2020. Bivariate analysis of drought duration and severity for irrigation planning. Agric. Water Manag. 229, 105926. https://doi.org/ 10.1016/j.agwat.2019.105926. Wang, Y., Yue, J., Liu, S., et al., 2018. Copula entropy coupled with wavelet neural network model for. Hydrol. Predict. 113 (1) https://doi.org/10.1088/1755-1315/113/1/012160. Ward S R., 1979. Deep’n as it Come: The 1927 Mississippi River Flood. By Pete Daniel. New York: Oxford University Press, vol. 1977. 162 pp. Softbound, Oral Hist. Rev. 7(1),86-87. Wei, T., Song, S., 2018. Copula-based composite likelihood approach for frequency analysis of short annual precipitation records. Nord. Hydrol. 49 (5), 1498e1512. https://doi.org/10.2166/ nh.2017.033. Xu, Y., 2005. Applications of Copula-Based Models in Portfolio Optimization. Ph.D. Dissertation. University of Miami, Florida. Yan, B., Guo, S., Chen, L., 2013. Estimation of reservoir flood control operation risks with considering inflow forecasting errors. Stoch. Environ. Res. Risk Assess. https://doi.org/10. 1007/s00477-013-0756-4. Yin, H.F., Li, C.A., 2001. Human impact on floods and flooddisasters on the Yangtze River. Geomorphology 41, 105e109. https://doi.org/10.1016/S0169-555X(01)00108-8. Zhang, J.Y., Hall, M.J., 2004. Regional flood frequency analysis for the Gan-Ming River basin in China. J. Hydrol. 296 (4), 98e117. https://doi.org/10.1016/j.jhydrol.2004.03.018. Zhang, L., Singh, V.P., 2006. Bivariate flood frequency analysis using the copula method. J. Hydrol. Eng. 11 (2), 150e164. https://doi.org/10.1061/(ASCE)1084-0699(2006)11:2(150). Zhang, L., Singh, V.P., 2007. GumbeleHougaard copula for trivariate rainfall frequency analysis. J. Hydrol. Eng. 12 (4), 409e419. https://doi.org/10.1061/(ASCE)1084-0699(2007)12:4(409). Zhang, L., Singh, V.P., 2019. Copulas and Their Applications in Water Resources Engineering. Cambridge University Press, Cambridge, England, p. 603. 
https://doi.org/10.1061/(ASCE) HE.1943-5584.0001824.

Zhang, J.H., Chen, L., Singh, V.P., Cao, H.W., Wang, D.W., 2015. Determination of the distribution of flood forecasting error. Nat. Hazards 75 (2), 2065-2065. https://doi.org/10.1007/s11069-014-1385-z.
Zhao, N., Lin, W.T., 2011. A copula entropy approach to correlation measurement at the country level. Appl. Math. Comput. 218 (2), 628-642. https://doi.org/10.1016/j.amc.2011.05.115.
Zhao, T., Cai, X., Yang, D., 2011. Effect of streamflow forecast uncertainty on real-time reservoir operation. Adv. Water Resour. 34 (4), 495-504. https://doi.org/10.1016/j.advwatres.2011.01.004.
Zhao, T., Zhao, J., Yang, D., Wang, H., 2013. Generalized martingale model of the uncertainty evolution of streamflow forecasts. Adv. Water Resour. 57, 41-51. https://doi.org/10.1016/j.advwatres.2013.03.008.

Appendix 1

Books and book chapters on data-driven approaches

Adams, T.E., Pagano, T.C., 2016. Flood Forecasting, A Global Perspective. Elsevier, The Netherlands, p. 478.
Govindaraju, R.S., Rao, A.R. (Eds.), 2013. Artificial Neural Networks in Hydrology, Volume 36. Springer Science & Business Media, p. 332.
Hipel, K.W., Fang, L. (Eds.), 2013. Stochastic and Statistical Methods in Hydrology and Environmental Engineering, Volume 4: Effective Environmental Management for Sustainable Development. Water Science and Technology Library, Volume 10/4. Springer Science & Business Media, p. 372.
Hipel, K.W., McLeod, A.I., Panu, U.S., Singh, V.P., 1993. Time series analysis in hydrology and environmental engineering. In: Stochastic and Statistical Methods in Hydrology and Environmental Engineering, Volume 3. Water Science and Technology Library, Springer Science & Business Media, B.V.
Knight, D., Shamseldin, A. (Eds.), 2005. River Basin Modelling for Flood Risk Mitigation. CRC Press, p. 608.
Kraijenhoff, D.A., Moll, J.R. (Eds.), 2012. River Flow Modelling and Forecasting, Volume 3. Springer Science & Business Media, p. 384.
Machiwal, D., Jha, M.K., 2012. Hydrologic Time Series Analysis: Theory and Practice. Capital Publishing Company, New Delhi, India and Springer, Germany, p. 303.
Riggs, H.C., 1985. Streamflow characteristics. In: Developments in Water Science Series, Volume 22. Elsevier, p. 248.
Salas, J.D., 1993. Analysis and modeling of hydrologic time series. In: Maidment, D.R. (Ed.), Handbook of Hydrology. McGraw-Hill, Inc., USA, pp. 19.1-19.72.
Schumann, G.J.-P., Bates, P.D., Apel, H., Aronica, G.T., 2018. Global Flood Hazard: Applications in Modeling, Mapping, and Forecasting. Wiley, p. 272.
Singh, V.P., Kumar, B., 1993. Surface-water hydrology. In: Proceedings of the International Conference on Hydrology and Water Resources. Springer Netherlands, New Delhi, India, p. 608.
Sivakumar, B., Berndtsson, R., 2010. Advances in Data-Based Approaches for Hydrologic Modeling and Forecasting. World Scientific, Singapore, p. 544.
Szilagyi, J., Szollosi-Nagy, A., 2010. Recursive Streamflow Forecasting: A State Space Approach. CRC Press, London, U.K., p. 212.
Tayfur, G., 2012. Soft Computing in Water Resources Engineering: Artificial Neural Networks, Fuzzy Logic and Genetic Algorithms. WIT Press, Southampton, Boston, p. 267.


Varoonchotikul, P., 2003. Flood Forecasting Using Artificial Neural Networks. A.A. Balkema Publishers, The Netherlands, p. 102.
Wang, P., 2006. Stochasticity, Nonlinearity and Forecasting of Streamflow Processes. Delft University Press, Amsterdam, The Netherlands, p. 224.
Yang, X.S., Gandomi, A.H., Talatahari, S., Alavi, A.H. (Eds.), 2013. Metaheuristics in Water, Geotechnical and Transport Engineering. Elsevier, p. 504.

Appendix 2

List of peer-reviewed journals on data-driven approaches

1. Acta Geophysica (Springer)
2. Applied Artificial Intelligence: An Int. Journal (Taylor & Francis)
3. Applied Soft Computing Journal (Elsevier)
4. Arabian Journal of Geosciences (Springer)
5. Computational Intelligence (Wiley-Blackwell)
6. Earth Science Informatics (Springer)
7. Engineering Applications of Computational Fluid Mechanics (Taylor & Francis)
8. Environmental Modelling & Software (Elsevier)
9. Expert Systems with Applications (Elsevier)
10. Geophysical Research Letters (Wiley)
11. Hydrological Processes (John Wiley & Sons, Ltd)
12. Hydrological Sciences Journal (Taylor & Francis)
13. Hydrology and Earth System Sciences (Copernicus Publications)
14. Hydrology Research (International Water Association Publishing)
15. International Journal of Forecasting (Elsevier)
16. Journal of Earth System Science (Springer)
17. Journal of Environmental Management (Elsevier)
18. Journal of Geophysical Research (American Geophysical Union)
19. Journal of Hydroinformatics (International Water Association Publishing)
20. Journal of Hydrology (Elsevier)
21. Journal of Hydraulic Engineering (American Society of Civil Engineers)
22. Journal of Hydrologic Engineering (American Society of Civil Engineers)
23. Journal of Water Resources Planning and Management (American Society of Civil Engineers)
24. Mathematics and Computers in Simulation (Elsevier)
25. Mathematical and Computer Modelling (Elsevier)
26. Natural Resource Modeling (Wiley)
27. Neurocomputing (Elsevier)
28. Neural Computing and Applications (Springer)
29. Neural Networks (Elsevier)
30. Nonlinear Processes in Geophysics (Copernicus Publications)
31. Physics and Chemistry of the Earth, Parts A/B/C (Elsevier)
32. River Research and Applications (John Wiley & Sons, Ltd)
33. Scientific Reports (Nature)
34. Soft Computing (Springer)
35. Stochastic Environmental Research and Risk Assessment (Springer)
36. Stochastic Hydrology and Hydraulics (Springer)
37. Water (MDPI)
38. Water Resources Management (Springer)
39. Water Resources Research (American Geophysical Union)

Appendix 3

Data and software

Web resources for open data sources of streamflow

- https://waterdata.usgs.gov/nwis/sw (surface water data such as gauge height (stage) and streamflow (discharge) for the United States)
- https://indiawris.gov.in/wris/#/ (streamflow data at different timescales for reservoirs located in India)
- https://www.cgd.ucar.edu/cas/catalog/surface/ (monthly river flow datasets for the world's largest 925 rivers, plus long-term mean river flow rates and continental discharge into the individual and global oceans)
- https://global-surface-water.appspot.com/download (Global Surface Water data, downloadable as 10 x 10 degree tiles in .tif format)
- https://www.bafg.de/GRDC/EN/02_srvcs/21_tmsrs/riverdischarge_node.html (streamflow datasets available on request for global river basins)
- https://floodobservatory.colorado.edu/DischargeAccess.html (daily discharge data for US river basins)

Software packages for streamflow modeling and forecasting

- R (statistical software)
- HEC-HMS (Hydrologic Engineering Center's Hydrologic Modeling System)
- HEC-RAS (Hydrologic Engineering Center's River Analysis System)
- MATLAB
- MINITAB
- SWMM (Storm Water Management Model)
- TUFLOW (Two-dimensional Unsteady Flow)


Index

Note: Page numbers followed by "f" indicate figures and "t" indicate tables.

A Adaptive neuro-fuzzy inference system (ANFIS), 6e7, 172e173, 305e306 artificial intelligence (AI) models, 172e173 artificial neural network (ANN), 172e174, 177e180 autoregressive moving average (ARMA), 172e173 backpropagation (BP), 177e180 coefficient of determination, 182 descriptive statistics, 179te180t fuzzy logic (FL) models, 172e173 gene expression programming (GEP), 172e173 grid partitioning (GP), 180e182 KaisereMeyereOlkin (KMO), 177 low-order drainage streams, 171 NasheSutcliffe efficiency (NSE), 182 performance assessment, 185t, 187t principal component analysis (PCA), 172e173, 177 rainfall-runoff modeling, 171e173 root mean square error (RMSE), 182 sensitivity analysis, 182e183 subtractive clustering (SC), 180e182 wavelet-ANN model, 172e173 Akaike Information Criterion (AIC), 93 Antalya Basin, 202e203 Appropriate input variables, gamma test, 319e321 Artificial intelligence (AI), 7, 20, 172e173, 305e306 Artificial neural network (ANN), 4e5, 10e20, 17te19t, 173e174, 177e180, 193, 216, 240e246, 265, 305e306, 331e332, 338e347 adaptive neuro-fuzzy inference system (ANFIS). See Adaptive neuro-fuzzy inference system (ANFIS) architecture, 118e120, 120f autocorrelation function (ACF), 159e160

backpropagation algorithm (BPA), 123, 125 backpropagation through time (BPTT), 155e156 bias variance trade-off, 125f connection formula, 117e118 convolutional neural network, 156, 157f data-driven, 115e116 data normalization techniques, 127 data postprocessing, 129e130 data preprocessing, 126e128 decimal scaling normalization, 127 dimensionality reduction, 122f feedforeword neural network (FFNN), 152e154 function approximation, 126 gated recurrent unit (GRU), 156 GRNN model, 134 knowledge driven, 115e116 long short-term memory network, 155e156 long-term and short-term timescales, 137te141t minimum-maximum normalization technique, 127 multilayer perceptron neural network, 130, 152e154 network architecture, 128e129 network training process, 120e125 neurons formula, 117e118 nonlinearity approaches, 116 principal component analysis (PCA), 126e128 recurrent neural network (RNN), 150, 155 second hidden layer, 162, 163te164t self-adaptive approaches, 116 static and dynamic neural network, 130e132 statistical neural networks, 132e134 streamflow forecasting, 157e159 supervised training method, 123e125 transfer function, 118, 119t


376 Index Artificial neural network (ANN) (Continued ) unsupervised training method, 121e123 WaveNet, 156e157 Z-score normalization, 127 Autocorrelation function (ACF), 93, 159e160, 204e206, 206f, 282e283, 291f Auto regressive integrated moving average (ARIMA), 53, 55te56t Auto regressive moving average (ARMA), 5e6, 51 Average mutual information (AMI), 221e222 Averaging multiclimate model prediction of streamflow artificial neural networks (ANNs) modeling, 240e246 empirical mode decomposition (EMD), 244e245 global climate models (GCMs), 239e240 machine learning methods, 240, 249e254, 251f model selection approach, 239e240 neural network framework, 246e247, 247fe248f plausible streamflow ANN-based prediction, 245 reasonable streamflow forecasts, 243e244 regional climate models (RCMs), 239e240 runoff prediction, 243e244 standardized precipitation index (SPI), 240 superior monthly streamflow forecasting, 245e246 support vector regression (SVR) modeling, 240e246, 248e249, 248f, 252t wavelet neural networks (WNN), 244e245

B Backpropagation (BP), 177e180 Backpropagation algorithm (BPA), 123, 125 Backpropagation through time (BPTT), 155e156 Best-fit model selection, 232, 233f Bias variance trade-off, 125f

C Classical genetic programming, 195e197, 195f Coefficient of determination, 182 Coefficient of efficiency (CE), 27 Conceptual models, 4e5 Connection formula, 117e118

Constricted flow methods, 2e3 Convolutional neural network, 156, 157f Copula entropy method application, 344e347 artificial neural network (ANN) modeling, 331e332, 335e336, 338e347 copula function, 337e338 entropy theory, 336e337 flood forecasts uncertainties, 347e353 fitting errors, 351 simulation, 354e360 linear correlation coefficient (LCC), 333 mutual information (MI) method, 333e334 partial mutual information, 342e343 performance evaluation, 345 Sacramento soil moisture accounting (SAC-SMA) model, 331e332 Cross-correlation function (CCF), 93, 221e222 Crossover operation, 195e196, 196f

D Data-driven models, 306 Data normalization techniques, 127 Data postprocessing, 129e130 Data preprocessing, 126e128 Data proportioning, 231, 231f Decimal scaling normalization, 127 Deep neural network (DNN) model, 246 Diagonal VECH model, 94e95 Differential evolution (DE), 282e283 Dimensionality reduction, 122f Direct measurement methods, 2e3 Discrete wavelet transform (DWT), 282 Distributed process-based models, 215e216 Dynamic neural network (DNN), 131

E Empirical mode decomposition (EMD), 244e245, 282 Empirical/statistical models, 215 Enhanced extreme learning machine (EELM), 282e283 Exponential smoothing (ETS), 51, 53e54, 60 Extreme learning machine (ELM), 264e267, 306 adaptive neuro-fuzzy inference system (ANFIS), 282

Index autocorrelation function (ACF), 282e283, 291f differential evolution (DE), 282e283 discrete wavelet transform (DWT), 282 empirical mode decomposition (EMD), 282 enhanced extreme learning machine (EELM), 282e283 generalized regression neural network (GRNN) model, 282e283 genetic algorithm (GA), 282e283 Hilbert transform (HT), 282 hybrid model integrating, 282e283 least square support vector machine (LSSVM), 282e283 M-EMDSVR model, 282e283 multiple linear regression (MLR) model, 284e287 multivariate adaptive regression spline (MARS), 282e283 NasheSutcliffe efficiency (NSE), 282e283 optimally pruned extreme learning machine (OPELM), 282e283 outlier-robust extreme learning machine (ORELM), 284 partial autocorrelation function (PACF), 282e283 particle swarm optimization (PSO), 282e283 regularized extreme learning machine (RELM), 284 self-exciting threshold autoregressive (SETAR) models, 282e283 support vector regression (SVR), 282e283 variational mode decomposition (VMD), 282e283 weighted regularized extreme learning machine (WRELM), 284

F False nearest neighbor (FNN), 221e222 Feedforeword neural network (FFNN), 152e154 Flood forecasts uncertainties, 347e353 fitting errors, 351 Function approximation, 126 Fuzzy logic (FL) models, 172e173


G Gamma test (GT), 316e318 Gated recurrent unit (GRU), 156 Gene expression programming (GEP), 172e173, 198e199, 198f, 202f General circulation models (GCM), 89e90 Generalized regression neural network (GRNN), 128, 134, 282e283, 306 Genetic algorithm (GA), 282e283 Genetic programming (GP), 305e306 artificial neural networks (ANNs), 193 classical genetic programming, 195e197, 195f crossover operation, 195e196, 196f gene expression programming (GEP), 198e199, 198f, 202f linear genetic programming (LGP), 197 models evaluating performance, 204 multigene genetic programming (MGGP), 197 mutation, 196 optimal number of input vectors, 204e206 partial autocorrelation function (PACF), 204e206 soft computing (SC) tools, 193 streamflow process modeling, 194 symbolic system identification, 193 time series modeling, 200te201t univariate modeling, 194 Global climate models (GCMs), 239e240 Global models (GMs), 264 Gray-box models, 4e5, 5f Gray wolf optimizer (GWO) algorithm, 309e310 Grid partitioning (GP), 180e182

H Hilbert transform (HT), 282 Hourly flood forecasting, AI techniques in, 268e269 Hybrid artificial intelligence (AI) models, 321e324 adaptive neuro-fuzzy inference system (ANFIS), 305e306 appropriate input variables, gamma test, 319e321 artificial intelligence (AI) method, 305e306 artificial neural network (ANN), 305e306 data-driven models, 306 extreme learning machine (ELM), 306 gamma test (GT), 316e318

378 Index Hybrid artificial intelligence (AI) models (Continued ) generalized regression neuralnetwork (GRNN), 306 genetic programming (GP), 305e306 Gray wolf optimizer (GWO) algorithm, 309e310 hybrid AI models, 321e324 hybrid multilayer perceptron (MLP) neural network modeling, 312, 313f hybrid SVR model, 313, 313f multilayer perceptron (MLP) neural network model, 309 multiple linear regression, 318 particle swarm optimization (PSO), 306e307 performance evaluation indicators, 318e319 support vector machines (SVM), 305e306 support vector regression (SVR), 306e308 wavelet-extreme learning machine (WA-ELM), 306 whale optimization algorithm (WOA), 306e307, 310e312 Hybrid data-driven techniques, 20e21, 22te23t Hybrid model integrating, 282e283 Hybrid multilayer perceptron (MLP) neural network modeling, 312, 313f Hybrid SVR model, 313, 313f

K KaisereMeyereOlkin (KMO), 177 K-nearest neighbors (KNN), 27

L Least square support vector machine (LSSVM), 282e283 Least square support vector regression (LSSVR) models, 265e266 Linear correlation coefficient (LCC), 333 Linear genetic programming (LGP), 197 LjungeBox test, 93 Long short-term memory (LSTM), 151, 155e156 Lumped conceptual models, 215e216

M Machine learning methods, 240, 249e254, 251f

Mathematical/process-based models, 215 Mean absolute error (MAE), 27, 67, 223e224 Mean squared error (MSE), 27 Median absolute error (MdAE), 67 M-EMDSVR model, 282e283 MGARCH approach, 100e104, 103fe104f MGARCH model, 94e95 Akaike Information Criterion (AIC), 93 autocorrelation functions (ACFs), 93 cross correlation functions (CCFs), 93 diagonal VECH model, 94e95 general circulation models (GCM), 89e90 testing conditional heteroscedasticity, 95 Minimum-maximum normalization technique, 127 M5 model tree, 268 Model configurations, 225, 227t, 228e231 Model selection approach, 239e240 Models evaluating performance, 204 Models performances comparative evaluation, 104e110, 107fe108f Model tree technique applications, 218e219 artificial neural networks (ANNs), 216 average mutual information (AMI), 221e222 best-fit model selection, 232, 233f cross-correlation function (CCF), 221e222 data proportioning, 231, 231f distributed process-based models, 215e216 empirical/statistical models, 215 false nearest neighbor (FNN), 221e222 lumped conceptual models, 215e216 mathematical/process-based models, 215 mean absolute error (MAE), 223e224 model calibration, 228 model configuration, 225, 227t model configurations, 228e231 model tree variants, 230, 230f Nash-Sutcliffe efficiency (NSE), 223e224 physical-scaled models, 215 root mean squared error (RMSE), 223e224 sensitivity analysis, 228e231 standard deviation reduction (SDR), 216e218 Model tree variants, 230, 230f Modular models (MMs), 264 MooreePenrose generalized inverse function, 267 Multigene genetic programming (MGGP), 197

Index Multilayer perception (MLP), 128, 152e154 Multilayer perceptron (MLP) neural network model, 130, 309 Multiple linear regression (MLR), 242, 284e287, 318 Multiple/multivariate generalized autoregressive conditional heteroscedasticity (MGARCH) approach, 88 Multiple/multivariate linear time series models Akaike Information Criterion (AIC), 93 auto correlation functions (ACFs), 93 cross correlation functions (CCFs), 93 general circulation models (GCM), 89e90 LjungeBox test, 93 model building procedure, 93 models performances comparative evaluation, 104e110, 107fe108f multiple/multivariate generalized autoregressive conditional heteroscedasticity (MGARCH) approach, 88, 94e95, 100e104, 103fe104f multivariate autoregressive (MAR) model, 89e90 rainfall-runoff process, 91f standardized precipitation index (SPI) time series, 89e90 VAR model, 97e100, 99t VARXeMGARCH model, 89e90 VARX model, 100, 102t vector autoregressive without/with exogenous variables (VAR/VARX) approach, 88, 90e93 Multivariate adaptive regression spline (MARS), 282e283 Multivariate autoregressive (MAR) model, 89e90 Mutation, 196 Mutual information (MI) method, 333e334

N Naı¨ve method, 62 NasheSutcliffe efficiency (NSE), 182, 223e224, 282e283 Network architecture, 128e129 Network training process, 120e125 Neural network framework, 246e247, 247fe248f


Neurons formula, 117e118 Noncontact measurement methods, 2e3 Nonlinearity approaches, 116

O Optimally pruned extreme learning machine (OPELM), 27, 282e283 Outlier-robust extreme learning machine (ORELM), 284

P Partial autocorrelation function (PACF), 204e206, 206f, 282e283 Partial mutual information, 342e343 Particle swarm optimization (PSO), 7e8, 282e283, 306e307 Performance evaluation indicators, 318e319 Physical-based models, 4e5 Physical-scaled models, 215 Plausible streamflow ANN-based prediction, 245 Principal component analysis (PCA), 126e128, 172e173, 177

R Radial basis function (RBF), 128, 242 Rainfall-runoff modeling, 171e173 Rainfall-runoff process, 91f Reasonable streamflow forecasts, 243e244 Rectified linear unit (ReLU), 246 Recurrent neural network (RNN), 131, 155 Regional climate models (RCMs), 239e240 Regularized extreme learning machine (RELM), 284 Root mean squared error (RMSE), 182, 204, 223e224 Runoff prediction, 243e244, 314e319

S Sacramento soil moisture accounting (SAC-SMA) model, 331e332 Second hidden layer, 162, 163te164t Selected model inputs, 346e347 Self-adaptive approaches, 116 Self-exciting threshold autoregressive (SETAR) models, 282e283 Sensitivity analysis, 182e183, 228e231 Short-term flood forecasting artificial neural networks (ANN), 265

380 Index Short-term flood forecasting (Continued ) extreme learning machine (ELM) model, 264 extreme learning machines (ELM), 265e267 global models (GMs), 264 hourly flood forecasting, AI techniques in, 268e269 M5 model tree, 268 modular models (MMs), 264 single-layer feedforward neural network (SLFN), 264 support vector machine (SVR), 264 Simple exponential smoothing (SES) models, 53 Single-layer feedforward neural network (SLFN), 264 Singular spectrum analysis (SSA), 7e8 Soft computing (SC), 193 Standard deviation reduction (SDR), 216e218, 268 Standardized precipitation index (SPI), 89e90, 240 Static and dynamic neural network, 130e132 Statistical models autoregressive integrated moving average (ARIMA), 53, 55te56t forecasting with, 59e60 autoregressive moving average (ARMA), 51 exponential smoothing models, 51, 53e54, 60 forecasting, 52e56 exponential smoothing models, 58t large-scale applications, two timescales, 60e77, 61f mean absolute error (MAE), 67 median absolute error (MdAE), 67 multistep ahead forecasting annual streamflow, 270 time series, 62e70, 64fe65f, 67fe69f monthly streamflow, 270 time series, 70e77, 73fe74f, 76f naı¨ve method, 62 simple exponential smoothing (SES) models, 53 Statistical neural networks, 132e134 Streamflow forecasting adaptive neuro-fuzzy inference system (ANFIS), 6e7 artificial intelligence (AI)-based techniques, 7

artificial neural network (ANN), 4e5 auto regressive moving average (ARMA), 5e6 coefficient of efficiency (CE), 27 conceptual models, 4e5 conceptual/physically based models, 4e5 constricted flow methods, 2e3 current trends, 28e29 data-driven methods AI techniques, 20 artificial neural network (ANN), 10e20, 17te19t hybrid data-driven techniques, 20e21, 22te23t time series modeling, 8e10, 11te14t data-driven models, 2, 4e5 direct measurement methods, 2e3 gray-box models, 4e5, 5f key challenges, 29e31 K-nearest neighbors (KNN), 27 mean absolute error (MAE), 27 mean squared error (MSE), 27 noncontact measurement methods, 2e3 optimally pruned extremelearning machine (OP-ELM), 27 particle swarm optimization (PSO), 7e8 physical-based models, 4e5 singular spectrum analysis (SSA), 7e8 support vector machine (SVM), 7e8 techniques/models used, 4e8 velocity-area methods, 2e3 white-box models, 4e5, 5f Streamflow process modeling, 194 Subexpressions (sub-ET), 198 Subtractive clustering (SC), 180e182 Superior monthly streamflow forecasting, 245e246 Supervised training method, 123e125 Support vector machine (SVM), 7e8, 201, 264, 305e306 Support vector regression (SVR), 240e246, 248e249, 248f, 252t, 282e283, 306e308 Symbolic system identification, 193

T Tapped delay line (TDL), 128 Testing conditional heteroscedasticity, 95 Time series modeling, 8e10, 11te14t, 200te201t

Traditional hydrological models, 134e135 Training dataset, 269 Transfer function, 118, 119t

U Univariate modeling, 194 Unsupervised training method, 121e123

V Variational mode decomposition (VMD), 282e283 VARXeMGARCH model, 89e90 Vector autoregressive (VAR), 97e100, 99t Vector autoregressive with exogenous variables (VARX model), 100, 102t Vector autoregressive without/with exogenous variables

(VAR/VARX) approach, 88, 90e93 Velocity-area methods, 2e3

W Wavelet-ANN model, 172e173, 201 Wavelet-extreme learning machine (WA-ELM), 306 Wavelet neural networks (WNN), 244e245 WaveNet, 156e157 Weighted regularized extreme learning machine (WRELM), 284 Whale optimization algorithm (WOA), 306e307, 310e312 White-box models, 4e5, 5f

Z Z-score normalization, 127
