5:15 PM - 7:15 PM
[AOS19-P04] Performance evaluation of statistical downscaling of storm surge along the coast of Japan

Keywords: Statistical downscaling, Coastal sea level, CMIP5
Understanding future changes in coastal sea levels under global warming is important because of their significant impact on coastal environments and societies. In addition to the long-term trend, coastal sea level variations on short timescales of a few days are also expected to change in the future. Statistical downscaling is an effective way to assess such future changes in short-term variations. In this study, we investigated whether a statistical model trained on observational data can correctly estimate the daily maximum sea levels along the coast of Japan when applied to climate model output. First, to estimate the daily maximum sea levels along the Japanese coast, two statistical models (linear regression and random forest) were trained independently using tide-gauge data and an atmospheric reanalysis product. We then evaluated the performance of these models by comparing their estimated sea levels with the coastal sea levels of an ultra-high-resolution ocean model, using as predictors the same atmospheric forcing data that drove the ocean model.
To train the statistical models, an empirical orthogonal function (EOF) analysis was performed on daily atmospheric fields from the JRA-55 reanalysis product (sea level pressure, total precipitation, and 10 m zonal and meridional wind speeds) after removing seasonal components. The analysis period is 1958-2020, and the domain is the 8°×8° area centered on each of the nine tide gauges along the Japanese coast. Next, the statistical models were trained using the EOF time series as the predictors and the difference between the observed daily maximum sea level and the monthly mean sea level as the predictand. The performance of the statistical models was evaluated using the RCP8.5 experiment (2086-2100) and the Historical experiment (1991-2005) of the 2 km future projection data around Japan (FORP-JPN02 version 2; FORP; Nishikawa et al., 2021), which was forced by atmospheric data derived from four CMIP5 models. For each tide-gauge location, we calculated daily time series of the atmospheric variables by projecting them onto the EOF spatial structures obtained from the JRA-55 product. We then fed these time series into the statistical models and added the monthly mean sea levels from FORP to the estimated values. These estimated daily maximum sea levels were compared with the daily maximum sea levels (corrected daily mean sea level) of FORP to evaluate the estimation performance of the statistical models.
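The training and projection steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the four atmospheric variables are assumed to be already deseasonalised and flattened into a single feature vector per day, only the linear regression model is shown, and all names (`fit_eofs`, `project`, `n_modes`, and so on) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_eofs(anomalies, n_modes):
    """Return the leading EOF spatial patterns, shape (n_modes, n_space), via SVD."""
    _, _, vt = np.linalg.svd(anomalies, full_matrices=False)
    return vt[:n_modes]

def project(anomalies, eofs):
    """Project anomaly fields onto fixed EOF patterns to obtain daily time series."""
    return anomalies @ eofs.T

# --- Training on "reanalysis" anomalies (synthetic stand-in for JRA-55) ---
n_train_days, n_space, n_modes = 1000, 50, 5
train_anom = rng.normal(size=(n_train_days, n_space))
eofs = fit_eofs(train_anom, n_modes)
pcs_train = project(train_anom, eofs)

# Synthetic predictand: daily maximum sea level minus monthly mean sea level.
true_coef = np.array([0.30, -0.20, 0.10, 0.05, -0.05])
y_train = pcs_train @ true_coef + 0.1 * rng.normal(size=n_train_days)

# Linear regression with an intercept, fitted by least squares.
design = np.column_stack([np.ones(n_train_days), pcs_train])
params, *_ = np.linalg.lstsq(design, y_train, rcond=None)
intercept, coef = params[0], params[1:]

# --- Application to "climate model" anomalies (synthetic stand-in for FORP forcing) ---
n_model_days = 360
model_anom = rng.normal(size=(n_model_days, n_space))
pcs_model = project(model_anom, eofs)  # same EOF patterns as in training

# Add the model's own monthly mean sea level back (12 synthetic 30-day months).
month_index = np.arange(n_model_days) // 30
monthly_mean = rng.normal(0.0, 0.05, size=12)[month_index]
estimated_daily_max = intercept + pcs_model @ coef + monthly_mean
```

The key design point is that the EOF patterns are computed once from the training data and then held fixed, so the predictors derived from the climate model forcing live in the same coordinate system as those used for training.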
The comparison shows that the correlation coefficients between the daily maximum sea levels estimated by the statistical models and those of FORP exceeded 0.7 in all combinations. The ratio of the root-mean-squared error (RMSE) to the standard deviation of the FORP daily maximum sea levels was less than 1 (Figure 1). We therefore conclude that both methods appropriately reproduce the phase and amplitude of the daily maximum coastal sea levels.
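The two skill measures used here can be sketched as follows. This is an illustrative example with made-up numbers, not the authors' code; the function names are hypothetical, and the use of the population (rather than sample) standard deviation is an assumption.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def rmse_over_sd(estimate, reference):
    """RMSE of the estimate divided by the population SD of the reference."""
    n = len(reference)
    ref_mean = sum(reference) / n
    sd = math.sqrt(sum((r - ref_mean) ** 2 for r in reference) / n)
    rmse = math.sqrt(sum((e - r) ** 2 for e, r in zip(estimate, reference)) / n)
    return rmse / sd

# Made-up daily maximum sea levels in metres (reference = ocean model).
reference = [0.10, 0.25, 0.05, 0.40, 0.30]
estimate = [0.12, 0.20, 0.08, 0.35, 0.33]

r = pearson_r(estimate, reference)
ratio = rmse_over_sd(estimate, reference)
```

With this normalisation, an estimate that always returns the reference mean gives a ratio of exactly 1, so a ratio below 1 indicates that the estimate outperforms that constant baseline.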
Next, we focused on sea levels above the 90th percentile, which we defined as extreme sea levels. The RMSE between the FORP extreme sea levels and those estimated by each statistical model, normalized by the standard deviation of the FORP sea levels (RMSE/SD), was less than 1 at all tide gauges, although the statistical models underestimated the mean values of the extreme sea levels except at the Wajima station for the linear regression model. Thus, both statistical models adequately reproduced the amplitude of the extremes. Note that, except at Kashiwazaki, the RMSE/SD averaged over climate models and experiments (Historical and RCP8.5) is smaller for the linear regression model than for the random forest. Therefore, the linear regression model is more accurate than the random forest in assessing the amplitude of the extreme daily maximum sea levels. In the future, applying these statistical models to CMIP6 data will make it possible to estimate future changes in storm surges, their uncertainties, and their differences between scenarios.
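A hedged sketch of the extreme-value evaluation follows. The abstract does not specify how extreme days are selected or which standard deviation is used, so this example assumes that extreme days are those on which the reference (ocean model) series exceeds its own 90th percentile, that the percentile uses the nearest-rank definition, and that the RMSE is normalized by the population SD of the full reference series. The function names and the synthetic data are hypothetical.

```python
import math

def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile of a sequence (pct in 0-100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def extreme_scores(estimate, reference, pct=90.0):
    """Return (RMSE/SD, mean bias) over days when the reference exceeds its percentile."""
    threshold = percentile_nearest_rank(reference, pct)
    pairs = [(e, r) for e, r in zip(estimate, reference) if r > threshold]
    if not pairs:
        raise ValueError("no reference values exceed the threshold")
    n = len(reference)
    ref_mean = sum(reference) / n
    sd = math.sqrt(sum((r - ref_mean) ** 2 for r in reference) / n)
    rmse = math.sqrt(sum((e - r) ** 2 for e, r in pairs) / len(pairs))
    bias = sum(e - r for e, r in pairs) / len(pairs)
    return rmse / sd, bias

# Synthetic series: the estimate is a damped copy of the reference,
# mimicking a statistical model that underestimates large values.
reference = [i / 10 for i in range(20)]
estimate = [0.8 * r for r in reference]

ratio_ext, bias_ext = extreme_scores(estimate, reference)
```

In this synthetic case the mean bias over the extreme days is negative, which corresponds to the kind of underestimation of extreme sea levels reported above, while the normalized RMSE can still remain below 1.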