Super-Resolution Data Assimilation Using Denoising Diffusion Probabilistic Models

Kazuya Miyashita; Yuki Yasuda; Ryo Onishi

3:30 PM - 3:45 PM

[MGI26-07] Super-Resolution Data Assimilation Using Denoising Diffusion Probabilistic Models

★Invited Papers

Kazuya Miyashita¹, *Yuki Yasuda¹, Ryo Onishi¹ (1. Tokyo Institute of Technology)

Keywords: Super Resolution, Data Assimilation, Diffusion Model, Deep Generative Model

Data assimilation is a crucial technique for many simulations in the Earth sciences. In such simulations, errors accumulate over time, for example because of unresolved physical processes, and assimilating observational data reduces them. From the viewpoint of Bayesian inference, data assimilation can be understood as estimating the probability distribution of the true state. This perspective suggests a role for deep generative models (generative AI), which are effective at approximating probability distributions. The present study demonstrates the use of deep generative models for simultaneous data assimilation and downscaling (i.e., super-resolution), resulting in faster and more accurate inference.
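The Bayesian view mentioned above can be written compactly. The notation here is an assumption introduced for illustration and is not taken from the abstract: x denotes the true state and y the observations.

```latex
p(x \mid y) = \frac{p(y \mid x)\, p(x)}{p(y)} \propto p(y \mid x)\, p(x)
```

Here p(x) plays the role of the prior (for example, a model forecast), p(y | x) is the observation likelihood, and p(x | y) is the posterior distribution that data assimilation aims to estimate.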

This research employs denoising diffusion probabilistic models (hereafter referred to as diffusion models) [1,2], a core technology in recent generative AI developments. During training, these models are optimized to estimate the noise added to the data (Fig. 1). During inference, diffusion models start from pure random noise and progressively denoise it to generate high-quality samples. This inference process can be interpreted as transforming a normal distribution into a multimodal distribution (Fig. 2).
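The training and inference steps described above can be illustrated with a minimal NumPy sketch of the standard formulation in [2]. It is not the authors' implementation: the function names are hypothetical, the linear schedule values are the defaults reported in [2], and the noise-predicting neural network is left out, so `eps_pred` stands in for its output.

```python
import numpy as np

def make_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linear variance schedule (default values as in Ho et al., 2020)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)  # cumulative products of alphas
    return betas, alphas, alpha_bars

def add_noise(x0, t, alpha_bars, rng):
    """Forward process: x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps."""
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps
    return x_t, eps

def noise_loss(eps_pred, eps):
    """Training objective: mean squared error between predicted and true noise."""
    return float(np.mean((eps_pred - eps) ** 2))

def reverse_step(x_t, t, eps_pred, betas, alphas, alpha_bars, rng):
    """One denoising step x_t -> x_{t-1}, given the predicted noise eps_pred."""
    mean = (x_t - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_pred) / np.sqrt(alphas[t])
    if t == 0:
        return mean  # no noise is added at the final step
    # Variance choice sigma_t^2 = beta_t, one of the options discussed in [2].
    return mean + np.sqrt(betas[t]) * rng.standard_normal(x_t.shape)
```

In a full sampler, `reverse_step` would be applied from the last step down to step 0, starting from a standard normal sample, with `eps_pred` supplied by the trained network at each step.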

We utilize a Super-Resolution Data Assimilation (SRDA) method [3,4]. In the method of [4], time evolution is computed with a low-resolution physics-based model, while super-resolution and data assimilation are performed by a neural network. The network takes as input the observations up to the current time and the low-resolution forecasts, including future time points, and infers time series at high resolution in space and time. Unlike the previous study [4], the use of deep generative models allows for uncertainty quantification in the inference.
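The data flow of this method can be sketched as follows. This is only an illustration of the division of labor described above, not the implementation in [4]: `srda_inference`, `lr_model_step`, and `network` are hypothetical names, and both callables must be supplied by the caller.

```python
import numpy as np

def srda_inference(lr_initial, observations, lr_model_step, network, n_steps):
    """Advance a low-resolution model, then map its forecasts to high resolution.

    lr_model_step: hypothetical callable advancing a low-resolution state by one step.
    network: hypothetical callable mapping (low-resolution series, observations)
             to a high-resolution series; it stands in for the trained neural network.
    """
    lr_forecasts = [lr_initial]
    for _ in range(n_steps):
        # Time evolution is done only by the low-resolution physics-based model.
        lr_forecasts.append(lr_model_step(lr_forecasts[-1]))
    lr_series = np.stack(lr_forecasts)  # shape: (n_steps + 1, ny_lr, nx_lr)
    # Super-resolution and data assimilation happen together in the network.
    return network(lr_series, observations)
```

With a generative network, calling `network` repeatedly on the same inputs would yield different samples, which is what makes uncertainty estimates possible.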

We evaluated the SRDA with the diffusion model through twin experiments using an idealized barotropic ocean jet stream [4,5]. For training, we prepared pairs of low-resolution (64 × 32) and high-resolution (128 × 64) data through fluid simulations and trained the diffusion model on this dataset. For testing, we performed ultra-high-resolution (1024 × 512) fluid simulations, which were treated as the ground truth. For comparison, an Ensemble Kalman Filter (EnKF) was applied to a high-resolution (128 × 64) fluid model.
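For readers unfamiliar with the EnKF baseline, a minimal NumPy sketch of one analysis step follows. The abstract does not state which EnKF variant or settings were used, so this assumes the stochastic (perturbed-observation) form with a linear observation operator and uncorrelated observation errors; the function and argument names are hypothetical.

```python
import numpy as np

def enkf_analysis(ensemble, obs, obs_operator, obs_error_std, rng):
    """Stochastic EnKF analysis step (assumed variant, for illustration only).

    ensemble: (n_members, n_state) forecast ensemble
    obs: (n_obs,) observation vector
    obs_operator: (n_obs, n_state) linear observation operator
    obs_error_std: scalar standard deviation of observation errors
    """
    n_members = ensemble.shape[0]
    anomalies = ensemble - ensemble.mean(axis=0)            # (n_members, n_state)
    obs_anomalies = anomalies @ obs_operator.T              # (n_members, n_obs)
    # Sample covariances estimated from the ensemble.
    cov_xy = anomalies.T @ obs_anomalies / (n_members - 1)  # (n_state, n_obs)
    cov_yy = obs_anomalies.T @ obs_anomalies / (n_members - 1)
    obs_cov = obs_error_std**2 * np.eye(obs.size)
    # Kalman gain K = cov_xy (cov_yy + R)^{-1}; the bracketed matrix is symmetric.
    gain = np.linalg.solve(cov_yy + obs_cov, cov_xy.T).T    # (n_state, n_obs)
    # Each member assimilates an independently perturbed copy of the observations.
    perturbed_obs = obs + obs_error_std * rng.standard_normal((n_members, obs.size))
    innovations = perturbed_obs - ensemble @ obs_operator.T
    return ensemble + innovations @ gain.T
```

Every member must be advanced by the fluid model between analysis steps, so the cost of the EnKF scales with the ensemble size.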

The SRDA with the diffusion model demonstrates high accuracy (Fig. 3). Despite relying on low-resolution simulation inputs, the SRDA outperforms the EnKF by 4.3% in accuracy. Moreover, SRDA inference is faster, requiring only about 18% of the EnKF's computation time. This efficiency arises mainly because the SRDA does not require ensemble simulations. We also found that the diffusion model can estimate inference uncertainty without ensemble time evolution (Fig. 4).
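One common way to turn a generative model's output into an uncertainty estimate is to draw several samples for the same inputs and summarize them pointwise. The sketch below assumes this approach; the abstract does not specify how the uncertainty in Fig. 4 was computed, and `summarize_samples` is a hypothetical helper.

```python
import numpy as np

def summarize_samples(samples):
    """Pointwise mean and spread of generated fields.

    samples: array of shape (n_samples, ny, nx), each entry one generated
    high-resolution field for the same inputs.
    """
    mean = samples.mean(axis=0)
    spread = samples.std(axis=0, ddof=1)  # unbiased sample standard deviation
    return mean, spread
```

Because the samples come from repeated denoising runs rather than from repeated time integration, no ensemble of fluid simulations is needed to obtain the spread.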

The results of this study highlight the potential of deep generative models for achieving super-resolution, data assimilation, and uncertainty quantification simultaneously. In future work, we plan to test the SRDA method on more realistic data, such as three-dimensional fluid simulation results.

References:
[1] Sohl-Dickstein et al., ICML, 2015.
[2] Ho et al., NeurIPS, 2020.
[3] Barthelemy et al., Ocean Dyn., 2022.
[4] Yasuda and Onishi, J. Adv. Model. Earth Syst., 2023.
[5] David et al., Ocean Model., 2017.

Presentation information

[M-GI26] Data-driven approaches for weather and hydrological predictions
