*A T M Sakiur Rahman1, Takahiro Hosono4,5, John Quilty2, Jayanta Das3, Amiya Basak3
(1. Postdoctoral researcher, Kumamoto University, 2. Department of Civil and Environmental Engineering, University of Waterloo, Waterloo, ON, Canada, 3. Department of Geography and Applied Geography, University of North Bengal, Darjeeling - 734013, India, 4. Faculty of Advanced Science and Technology, Kumamoto University, 2-39-1 Kurokami, Kumamoto 860-8555, Japan, 5. International Research Organization for Advanced Science and Technology, Kumamoto University, 2-39-1 Kurokami, Kumamoto 860-8555, Japan)
Keywords: Automated Hybrid Machine Learning, Groundwater level forecasting, Kumamoto, Japan
Groundwater level (GWL) forecasting is a crucial task for the planning and management of water resources. However, accurate GWL forecasting is challenging because of the nonlinear relationships between GWL and the driving hydro-meteorological (input) variables (e.g., rainfall, air temperature). A prerequisite for developing accurate machine learning (ML) models is to select relevant input variables from a range of candidates and then to optimize the model parameters. To address these problems, this study tested the ability of ML approaches such as Random Forests (RF) and eXtreme Gradient Boosting (XGB), which can select input variables internally during model development, for GWL forecasting. RF and XGB were then compared with Support Vector Regression (SVR), which has a history of high performance in GWL forecasting. Further, the RF, XGB, and SVR models were fed with multiscale information obtained through wavelet transforms (WT) to develop new hybrid ML approaches (WT-RF, WT-XGB, and WT-SVR) for GWL forecasting in Kumamoto, Japan. The ML approaches were also coupled with Bayesian hyper-parameter optimization (BHO) to estimate the model parameters automatically. The capability of the models was tested for GWL forecasting at monthly lead times of 1, 2, and 3 months ahead. The results revealed that the standalone ML approaches can successfully forecast GWL across all lead times, and that XGB and SVR had very similar accuracy, while RF showed higher error in Kumamoto, Japan. Although SVR and XGB were similarly accurate, SVR requires an external algorithm to select the input variables properly before the model is developed. Further, the WT-based models showed improved performance (3-5%) over the standalone models.
Therefore, coupling BHO, WT, and ML is a promising new framework for GWL forecasting and is recommended for further exploration in short- and long-term forecasting of other hydro-meteorological variables (e.g., streamflow, evaporation).
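As a rough illustration of how multiscale wavelet information and multi-month lead times could be turned into model inputs, the sketch below decomposes a monthly series with a causal Haar "a trous" transform and pairs the resulting components with GWL values 1, 2, and 3 months ahead. This is only a minimal sketch under stated assumptions: the abstract does not specify the wavelet family, the decomposition algorithm, or the number of levels, so the Haar filter, the two levels, the function names (haar_a_trous, make_samples), and the toy GWL values are all hypothetical choices made for illustration, not the study's data or method. The resulting (X, y) pairs would then be passed to a learner such as RF, XGB, or SVR, which is not shown here.

```python
def haar_a_trous(series, levels):
    """Causal, additive Haar (a trous) decomposition (illustrative choice).

    Returns (details, approx) such that, for every t,
    series[t] == approx[t] + sum(d[t] for d in details).
    Only current and past values are used at each step.
    """
    approx = list(series)
    details = []
    for j in range(levels):
        step = 2 ** j
        # Average each value with the one `step` months earlier;
        # near the start of the record, fall back to the first value.
        smoothed = [
            0.5 * (approx[t] + approx[max(t - step, 0)])
            for t in range(len(approx))
        ]
        details.append([a - s for a, s in zip(approx, smoothed)])
        approx = smoothed
    return details, approx


def make_samples(features, target, lead):
    """Pair the feature values at month t with the target at month t + lead."""
    n = len(target) - lead
    X = [[f[t] for f in features] for t in range(n)]
    y = [target[t + lead] for t in range(n)]
    return X, y


# Hypothetical monthly GWL values (made up for illustration only).
gwl = [10.0, 10.4, 11.1, 10.8, 10.2, 9.9, 10.3, 10.9]

details, approx = haar_a_trous(gwl, levels=2)
features = details + [approx]  # two detail components plus the approximation

# One supervised data set per lead time (1, 2, and 3 months ahead).
samples = {lead: make_samples(features, gwl, lead) for lead in (1, 2, 3)}
```

Because the decomposition in this sketch is additive, the components sum back to the original series, and because it is causal, the inputs at month t do not depend on later months, which matters when the components are used as forecasting inputs.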