Japan Geoscience Union Meeting 2021

Presentation information

[J] Oral presentation

Session symbol A (Atmospheric and Hydrospheric Sciences) » A-CG Atmospheric, Oceanic, and Environmental Sciences: Complex and General

[A-CG43] Earth and Environmental Sciences and Artificial Intelligence/Machine Learning

Thursday, June 3, 2021, 13:45 - 15:15, Ch.07 (Zoom Room 07)

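Conveners: Tomohiko Tomita (Faculty of Advanced Science and Technology, Kumamoto University), Shigeki Hosoda (Japan Agency for Marine-Earth Science and Technology), Ken-ichi Fukui (Osaka University), Satoshi Ono (Kagoshima University); Chairpersons: Tomohiko Tomita (Faculty of Advanced Science and Technology, Kumamoto University), Shigeki Hosoda (Japan Agency for Marine-Earth Science and Technology)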

15:00 - 15:15

[ACG43-06] Deep learning approach for rainfall forecasting using U-Net with data augmentation

*Ryo Kaneko1, Shiho Onomura2, Makoto Nakayoshi2 (1. The University of Tokyo, 2. Tokyo University of Science)

Keywords: precipitation forecasting, radar rainfall, deep learning, heavy rainfall, disaster

In recent years, heavy rainfall disasters have occurred almost every year in Japan. The Japan Meteorological Agency (JMA) expects rainfall to become more severe and more frequent in the future. Under these conditions, accurate short-term precipitation prediction is essential to give people enough time to evacuate to safer places. Deep learning has recently attracted attention for its achievements in fields such as autonomous driving and machine translation. Although deep learning is gradually being applied to precipitation prediction, to the best of our knowledge it has not yet been extended to heavy rainfall prediction. In this study, we aimed to develop a deep learning model that can predict heavy rainfall events.

We trained U-Net (Ronneberger et al., 2015), a deep learning architecture for semantic segmentation, on the nationwide Radar-AMeDAS precipitation (RAP) data produced by JMA, which have a spatial resolution of 1 km and a temporal resolution of 30 minutes. We divided the data into 13 regions of 256 km square because the nationwide data are too large for training. The input data are the continuous rainfall intensities for the 6 hours up to the current time (30-minute intervals), and the teacher data are the categorized rainfall intensities for the next 6 hours (1-hour intervals). The output categories were determined with reference to JMA's precipitation categories: Cat.0 denotes no precipitation, i.e., less than 0.1 mm h⁻¹; Cat.1, 0.1 - 10 mm h⁻¹; Cat.2, 10 - 30 mm h⁻¹; Cat.3, 30 - 50 mm h⁻¹; and Cat.4, over 50 mm h⁻¹. We divided the 13 years of precipitation data as follows: the data for 2006 - 2012 were used for training, the data for 2013 - 2015 for validation to monitor the learning progress, and the data for 2016 - 2018 for testing, in which the prediction accuracy was evaluated.
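The category thresholds and the year-based split described above can be illustrated with a minimal sketch. This is not the authors' code: the function names `categorize` and `split_for_year` are hypothetical, and the handling of values that fall exactly on a boundary (assigned here to the higher category) is an assumption, since the abstract does not state it.

```python
import numpy as np

# Lower bounds (mm/h) of Cat.1 to Cat.4, taken from the abstract.
# Assumption: a value exactly on a boundary goes to the higher category.
BOUNDS = np.array([0.1, 10.0, 30.0, 50.0])


def categorize(rain_mm_per_h):
    """Map rainfall intensity (mm/h) to category indices 0..4."""
    # np.digitize returns i such that BOUNDS[i-1] <= x < BOUNDS[i].
    return np.digitize(rain_mm_per_h, BOUNDS)


def split_for_year(year):
    """Return the data split a given year belongs to, per the abstract."""
    if 2006 <= year <= 2012:
        return "train"
    if 2013 <= year <= 2015:
        return "validation"
    if 2016 <= year <= 2018:
        return "test"
    raise ValueError(f"year {year} is outside the 2006-2018 study period")


if __name__ == "__main__":
    sample = np.array([0.0, 5.0, 12.5, 42.0, 80.0])
    print(categorize(sample))    # category index for each intensity
    print(split_for_year(2017))  # split name for a given year
```

Applied to a 256 x 256 target frame, `categorize` would yield the per-pixel class labels that a segmentation network such as U-Net is trained to predict.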

Furthermore, we applied data augmentation to the training data by generating pseudo data. We applied affine transformations, namely rotations from -60 degrees to 60 degrees and expansions from 1x to 4x, to heavy rain events. We prepared two calculation cases: CNT, trained without augmented data, and AUG, trained with augmentation. The two cases were compared in terms of prediction accuracy.
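To make the augmentation step concrete, the following is a minimal NumPy sketch of a rotation plus expansion applied to a rainfall sequence. It rests on several assumptions that the abstract does not confirm: "expansion" is read as spatial magnification about the domain centre, the angle and scale are drawn uniformly from the stated ranges, nearest-neighbour sampling is used, and pixels mapped from outside the domain are filled with zero rainfall. The names `rotate_and_scale` and `augment_sample` are hypothetical.

```python
import numpy as np


def rotate_and_scale(field, angle_deg, scale):
    """Rotate a 2-D field about its centre and magnify it by `scale`.

    Uses inverse mapping with nearest-neighbour sampling; pixels whose
    source lies outside the domain are set to zero (assumed: no rain).
    """
    h, w = field.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    theta = np.deg2rad(angle_deg)
    cos_t, sin_t = np.cos(theta), np.sin(theta)
    yy, xx = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # For each output pixel, undo the magnification, then the rotation.
    dy, dx = (yy - cy) / scale, (xx - cx) / scale
    src_y = cos_t * dy + sin_t * dx + cy
    src_x = -sin_t * dy + cos_t * dx + cx
    iy = np.rint(src_y).astype(int)
    ix = np.rint(src_x).astype(int)
    inside = (iy >= 0) & (iy < h) & (ix >= 0) & (ix < w)
    out = np.zeros_like(field)
    out[inside] = field[iy[inside], ix[inside]]
    return out


def augment_sample(frames, rng):
    """Apply one random transform to every frame of a (T, H, W) sequence.

    Assumption: angle ~ U(-60, 60) degrees and scale ~ U(1, 4).
    """
    angle = rng.uniform(-60.0, 60.0)
    scale = rng.uniform(1.0, 4.0)
    augmented = np.stack([rotate_and_scale(f, angle, scale) for f in frames])
    return augmented, angle, scale


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frames = rng.random((3, 16, 16))
    augmented, angle, scale = augment_sample(frames, rng)
    print(augmented.shape, round(angle, 1), round(scale, 2))
```

The same angle and scale are used for all frames in a sequence so that the input frames and the target frames of one pseudo event stay spatially consistent.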

Figure 1 shows the prediction for the July 2017 heavy rainfall event in the northern Kyushu region from 10 a.m. to 3 p.m. on 5 July. The CNT model predicted only light rain and was unable to predict the heavy rainfall. In contrast, the AUG model predicted the two linear precipitation systems extending east-west 6 hours ahead, though it tended to overestimate. The accuracy of the AUG prediction was also higher at shorter prediction times. Furthermore, we evaluated the forecast results of AUG for the entire test period (Figure 2). Even for Cat.4, the critical success index (CSI) exceeded 0.3 at 1 hour ahead, but the false alarm rate (FAR) decreased rapidly with longer forecast time, and the CSI decreased accordingly. However, the probability of detection (POD) decreased gradually with longer forecast times in all categories. Thus, the model trained with pseudo data could predict some heavy rainfall events six hours before their occurrence, even though it tends to overestimate.
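For readers unfamiliar with the verification scores mentioned above, the sketch below computes POD, FAR, and CSI for one category from the standard contingency-table counts (hits, misses, false alarms). Two points are assumptions rather than statements from the abstract: an "event" is taken to be an exact match of the category index, and FAR is computed as the false alarm ratio, false alarms / (hits + false alarms). The function name `categorical_scores` is hypothetical.

```python
import numpy as np


def categorical_scores(pred, obs, category):
    """Return POD, FAR, and CSI for one rainfall category.

    Assumption: a pixel counts as an event when its category index
    equals `category` exactly.
    """
    pred_event = np.asarray(pred) == category
    obs_event = np.asarray(obs) == category
    hits = int(np.sum(pred_event & obs_event))
    misses = int(np.sum(~pred_event & obs_event))
    false_alarms = int(np.sum(pred_event & ~obs_event))

    def ratio(numerator, denominator):
        # Undefined scores (empty denominator) are reported as NaN.
        return numerator / denominator if denominator > 0 else float("nan")

    return {
        "POD": ratio(hits, hits + misses),
        "FAR": ratio(false_alarms, hits + false_alarms),
        "CSI": ratio(hits, hits + misses + false_alarms),
    }


if __name__ == "__main__":
    obs = np.array([4, 4, 4, 0, 0, 1])
    pred = np.array([4, 4, 0, 4, 0, 1])
    print(categorical_scores(pred, obs, category=4))
```

Because CSI counts both misses and false alarms in its denominator, a model that overestimates heavy rain can keep a high POD while its CSI remains low.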

We found that data augmentation is a powerful tool for predicting low-frequency events such as heavy rainfall and that our transformation method was appropriate for reproducing the real phenomena. On the other hand, the model tended to overestimate, which might be partly due to imbalanced sample sizes during training. For example, the sample size differs among categories and also among prediction times, which leads to a bias toward a specific category during training. To prevent this problem, methods of data augmentation and data balance control need to be explored further.