3:00 PM - 3:15 PM
[ACG43-06] Deep learning approach for rainfall forecasting using U-Net with data augmentation
Keywords: Precipitation forecasting, Radar precipitation, Deep learning, Heavy rainfall, Disaster
We trained U-Net (Ronneberger et al., 2015), a deep learning architecture for semantic segmentation, on the nationwide Radar-AMeDAS precipitation (RAP) data produced by JMA, which has a spatial resolution of 1 km and a temporal resolution of 30 minutes. Because the nationwide data is too large to train on directly, we divided it into 13 regions of 256 km square. The input data is the continuous rainfall intensity for the 6 hours up to the current time (30-minute intervals), and the teacher data is the categorized rainfall intensity for the next 6 hours (1-hour intervals). The output categories were determined with reference to JMA's precipitation categories: Cat.0 denotes no precipitation, i.e., less than 0.1 mm h⁻¹; Cat.1, 0.1 - 10 mm h⁻¹; Cat.2, 10 - 30 mm h⁻¹; Cat.3, 30 - 50 mm h⁻¹; and Cat.4, over 50 mm h⁻¹. We divided the 13 years of precipitation data as follows: 2006 - 2012 for training, 2013 - 2015 for validation to monitor the learning progress, and 2016 - 2018 for testing, in which the prediction accuracy was evaluated.
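The categorization of the teacher data and the year-based split described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names are hypothetical, and the treatment of values lying exactly on a category boundary (assigned to the higher category here) is an assumption, since the abstract does not specify it.

```python
from bisect import bisect_right

# Lower bounds (mm/h) of Cat.1 to Cat.4, taken from the category list above.
# Assumption: a value exactly on a boundary falls into the higher category.
CATEGORY_LOWER_BOUNDS = [0.1, 10.0, 30.0, 50.0]


def rainfall_category(intensity_mm_per_h):
    """Map a rainfall intensity (mm/h) to a category index 0..4."""
    return bisect_right(CATEGORY_LOWER_BOUNDS, intensity_mm_per_h)


def split_for_year(year):
    """Return the data split (train / validation / test) for a given year."""
    if 2006 <= year <= 2012:
        return "train"
    if 2013 <= year <= 2015:
        return "validation"
    if 2016 <= year <= 2018:
        return "test"
    raise ValueError(f"year {year} is outside the 2006-2018 study period")
```

Splitting by calendar year, as the abstract does, keeps the test period strictly later than the training period, so the evaluation is not influenced by events seen during training.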
Furthermore, we applied data augmentation to the training data by generating pseudo data. We applied affine transformations, such as rotations from -60 to 60 degrees and expansions from 1x to 4x, to heavy rain events. We prepared two experimental cases: CNT, trained without augmented data, and AUG, trained with augmentation. The two cases were compared in terms of prediction accuracy.
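A rotation-plus-expansion augmentation of this kind can be sketched in plain Python as below. This is an illustrative sketch under several assumptions not stated in the abstract: the transformation is taken about the grid centre, nearest-neighbour resampling is used, cells mapped from outside the original grid are filled with zero rainfall, and the angle and scale are drawn uniformly from the stated ranges. The function names are hypothetical.

```python
import math
import random


def affine_transform(field, angle_deg, scale):
    """Rotate and expand a 2-D field about its centre (nearest-neighbour).

    field is a list of equal-length rows. Cells whose source location lies
    outside the original grid are filled with 0.0 (assumed: no rain).
    """
    n_rows, n_cols = len(field), len(field[0])
    cy, cx = (n_rows - 1) / 2.0, (n_cols - 1) / 2.0
    cos_a = math.cos(math.radians(angle_deg))
    sin_a = math.sin(math.radians(angle_deg))
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for y in range(n_rows):
        for x in range(n_cols):
            # Inverse mapping: undo the expansion, then undo the rotation,
            # to find which source cell lands on output cell (y, x).
            dy, dx = (y - cy) / scale, (x - cx) / scale
            src_y = int(round(cy + cos_a * dy - sin_a * dx))
            src_x = int(round(cx + sin_a * dy + cos_a * dx))
            if 0 <= src_y < n_rows and 0 <= src_x < n_cols:
                out[y][x] = field[src_y][src_x]
    return out


def augment(field, rng):
    """Apply one random rotation and expansion in the ranges given above."""
    angle_deg = rng.uniform(-60.0, 60.0)  # assumed uniform sampling
    scale = rng.uniform(1.0, 4.0)         # assumed uniform sampling
    return affine_transform(field, angle_deg, scale)
```

In a forecasting setup, the same angle and scale would presumably have to be applied to every input and teacher frame of a sample so that the pseudo event remains temporally consistent.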
Figure 1 shows the predictions for the July 2017 heavy rainfall event in the northern Kyushu region from 10 a.m. to 3 p.m. on 5 July. The CNT model predicted only light rain and was unable to predict the heavy rainfall. In contrast, the AUG model predicted the two linear precipitation systems extending east-west 6 hours ahead, though it tended to overestimate. The accuracy of the AUG prediction was also higher at shorter prediction times. Furthermore, we evaluated the AUG forecasts over the entire test period (Figure 2). Even for Cat.4, the critical success index (CSI) exceeded 0.3 at 1 hour ahead, but the false alarm rate (FAR) increased rapidly with longer forecast times, and the CSI decreased accordingly. By contrast, the probability of detection (POD) decreased only gradually with longer forecast times in all categories. Thus, the model trained with pseudo data could predict some heavy rainfall events six hours before they occurred, although it tends to overestimate.
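For reference, the three scores mentioned above can be computed from a contingency table of hits, misses, and false alarms, as in the sketch below. It uses the standard forecast-verification definitions and assumes that FAR denotes the false alarm ratio, false alarms / (hits + false alarms), and that an event is an exact match of the category index. The abstract does not state either choice, and the function name is hypothetical.

```python
def categorical_scores(forecast, observed, category):
    """Return (POD, FAR, CSI) for one rainfall category.

    forecast and observed are equal-length sequences of category indices.
    A score is None when its denominator is zero.
    """
    hits = misses = false_alarms = 0
    for f, o in zip(forecast, observed):
        if f == category and o == category:
            hits += 1          # predicted and observed
        elif o == category:
            misses += 1        # observed but not predicted
        elif f == category:
            false_alarms += 1  # predicted but not observed

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    pod = ratio(hits, hits + misses)
    far = ratio(false_alarms, hits + false_alarms)  # false alarm ratio (assumed)
    csi = ratio(hits, hits + misses + false_alarms)
    return pod, far, csi
```

Because false alarms appear in the denominator of the CSI but not of the POD, a model that overestimates heavy rain can keep a relatively high POD while its CSI falls.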
We found that data augmentation is a powerful tool for predicting low-frequency events such as heavy rainfall, and that our transformation method was appropriate for reproducing the real phenomena. On the other hand, there was a tendency toward overestimation, which might be partly due to imbalanced sample sizes during training. For example, the sample size differs among categories and also among prediction times, which leads to a bias toward a specific category during training. To prevent this problem, methods of data augmentation and data balance control need to be explored further.