9:15 AM - 9:30 AM
[MGI27-02] Conditional Deep Diffusion Modeling for GSMaP Inpainting
Keywords: GSMaP, Precipitation map, Diffusion models, Machine learning
The Global Satellite Mapping of Precipitation (GSMaP), provided by the Japan Aerospace Exploration Agency (JAXA), is a satellite-based precipitation retrieval product that integrates observations from multiple satellites. Because the polar-orbiting satellites that carry microwave sensors observe each location only intermittently, GSMaP cannot estimate precipitation continuously over the globe, and the microwave-based precipitation estimates contain substantial missing regions. The current GSMaP algorithm fills these regions by applying transformation equations that interpolate from temporally adjacent observations. However, this approach often introduces spatial discontinuities, with the interpolated precipitation fields failing to connect smoothly to the surrounding observed regions, because the method prioritizes temporal consistency over spatial continuity.
To overcome these limitations, we propose a machine learning-based approach. Precipitation map inpainting can be formulated as video inpainting, a computer vision task in which missing regions are reconstructed from temporal information in adjacent frames and spatial cues in surrounding areas. Recent studies have framed video inpainting as a conditional generation task using diffusion models, a state-of-the-art class of generative models. A conditional diffusion model built on a 3D U-Net learns spatio-temporal features from paired incomplete and complete video samples, and then reconstructs fully inpainted videos from unseen inputs with missing regions. Furthermore, because the model is trained end-to-end, it eliminates the need to hand-design complex inpainting algorithms.
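To make the conditional-generation framing concrete, the sketch below shows one training step of such a diffusion inpainting model in PyTorch. The linear noise schedule, the noise-prediction loss, and the way observed regions enter as conditioning channels are standard choices assumed here for illustration; the abstract does not specify these details.

```python
import torch

# Minimal sketch of one training step for conditional diffusion inpainting.
# `denoiser` is any network mapping (noisy video, condition, timestep) to a
# noise estimate, e.g. the 3D U-Net described below; its internals are not
# fixed by the abstract, so this is an illustrative assumption.

T = 1000                                    # number of diffusion steps (assumed)
betas = torch.linspace(1e-4, 0.02, T)       # standard linear noise schedule
alpha_bars = torch.cumprod(1.0 - betas, 0)  # cumulative signal-retention factors

def training_step(denoiser, video, mask):
    """video: (B, 1, L, H, W) complete precipitation maps;
    mask:  (B, 1, L, H, W), 1 where observed, 0 where missing."""
    b = video.shape[0]
    t = torch.randint(0, T, (b,))                        # random timestep per sample
    a = alpha_bars[t].view(b, 1, 1, 1, 1)
    noise = torch.randn_like(video)
    noisy = a.sqrt() * video + (1.0 - a).sqrt() * noise  # forward diffusion q(x_t | x_0)
    cond = torch.cat([video * mask, mask], dim=1)        # observed regions as condition
    pred = denoiser(noisy, cond, t)                      # predict the injected noise
    return torch.nn.functional.mse_loss(pred, noise)     # epsilon-prediction loss
```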
Our model consists of a 3D U-Net and a 3D condition encoder. The 3D U-Net learns the reverse diffusion process, predicting progressively less noisy precipitation maps over sequences of L time steps from noisy precipitation maps with missing regions, while capturing spatio-temporal features. Features encoded by the 3D condition encoder are injected into each layer of the U-Net encoder and decoder. The condition inputs, including infrared imagery, latitude-longitude grids, and date information, provide additional guidance for inpainting.
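The following sketch illustrates one way per-scale features from a 3D condition encoder can be injected into each stage of a 3D U-Net. The two-scale depth, channel widths, and additive fusion are illustrative assumptions, and the diffusion timestep embedding is omitted for brevity.

```python
import torch
from torch import nn

class CondEncoder(nn.Module):
    """Encodes condition inputs (e.g. IR imagery, lat-lon grids, date channels)
    into feature maps at each U-Net scale."""
    def __init__(self, cond_ch, chs=(32, 64)):
        super().__init__()
        self.stages = nn.ModuleList()
        c = cond_ch
        for out_c in chs:
            self.stages.append(nn.Sequential(
                nn.Conv3d(c, out_c, 3, padding=1), nn.SiLU(),
                nn.Conv3d(out_c, out_c, 3, stride=2, padding=1)))  # downsample by 2
            c = out_c

    def forward(self, cond):
        feats, x = [], cond
        for stage in self.stages:
            x = stage(x)
            feats.append(x)
        return feats

class Cond3DUNet(nn.Module):
    """Two-scale 3D U-Net whose encoder and decoder stages receive the
    matching condition features by addition."""
    def __init__(self, in_ch=1, cond_ch=4, chs=(32, 64)):
        super().__init__()
        self.cond_enc = CondEncoder(cond_ch, chs)
        self.enc1 = nn.Sequential(nn.Conv3d(in_ch, chs[0], 3, stride=2, padding=1), nn.SiLU())
        self.enc2 = nn.Sequential(nn.Conv3d(chs[0], chs[1], 3, stride=2, padding=1), nn.SiLU())
        self.dec2 = nn.Sequential(nn.ConvTranspose3d(chs[1], chs[0], 4, stride=2, padding=1), nn.SiLU())
        self.dec1 = nn.ConvTranspose3d(chs[0] * 2, in_ch, 4, stride=2, padding=1)

    def forward(self, x, cond, t):
        # (timestep embedding of t omitted for brevity)
        c1, c2 = self.cond_enc(cond)       # condition features at each scale
        h1 = self.enc1(x) + c1             # inject at encoder stage 1
        h2 = self.enc2(h1) + c2            # inject at encoder stage 2
        d2 = self.dec2(h2) + c1            # inject at decoder stage
        return self.dec1(torch.cat([d2, h1], dim=1))  # skip connection, noise estimate
```

In this sketch, additive fusion requires the condition encoder to mirror the U-Net's downsampling so feature shapes match at every scale; concatenation or cross-attention would be equally valid choices.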
For the experiments, we used hourly precipitation data from 2023 in the ERA5 dataset, provided by ECMWF, as the complete precipitation maps. Training data were generated by extracting observation masks from GSMaP and applying them to the ERA5 fields. For infrared imagery, we used the GPM Merged IR product. The trained model successfully inpainted missing regions in actual GSMaP data. We then evaluated whether the proposed method achieves more spatially coherent inpainting than conventional approaches.
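As a rough illustration of this training-pair construction, the snippet below degrades a complete precipitation field with an observation mask. Synthetic arrays stand in for the actual ERA5 fields and GSMaP masks, and the array shapes and variable names are assumptions.

```python
import numpy as np

# Sketch of training-pair construction: complete precipitation fields are
# degraded with observation masks. Real data would be loaded from ERA5 and
# GSMaP; the random arrays here are stand-ins for illustration only.
rng = np.random.default_rng(0)
era5_precip = rng.gamma(2.0, 1.0, size=(24, 180, 360)).astype(np.float32)  # hourly fields (assumed shape)
gsmap_mask = (rng.random((24, 180, 360)) > 0.4).astype(np.float32)         # 1 = observed, 0 = missing

masked_precip = era5_precip * gsmap_mask  # incomplete input to the model
target_precip = era5_precip              # complete target for supervision
```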