5:15 PM - 6:30 PM
[AAS07-P04] System-Application Co-design for Supercomputer Fugaku and Global Ensemble Weather Data Assimilation
Keywords: High Performance Computing, Data Assimilation, Global Nonhydrostatic Atmospheric Model, NICAM, LETKF
The supercomputer Fugaku, Japan's new flagship machine, took first place in four categories of the 2020 international benchmark rankings. The system was developed through "system-application co-design," which aims for high performance in real-world scientific computing software. The Nonhydrostatic ICosahedral Atmospheric Model (NICAM) and the Local Ensemble Transform Kalman Filter (LETKF) were chosen as the target applications from the weather and climate science domain. We improved the computational performance of the NICAM-LETKF data assimilation system to achieve computation 100 times faster than on the K computer.
Weather and climate models are data-intensive applications, which means that running a simulation necessarily involves moving large amounts of data. A "data-centric" design is therefore an essential approach to performance improvement, and it applies not only to our software but also to other models on other supercomputers. We expanded the use of lower-precision floating-point arithmetic and developed a performance evaluation method that efficiently identifies the time-consuming parts caused by non-computational operations. Furthermore, we optimized the data transfer between the simulation and the data assimilation.
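As a rough illustration of why lower-precision arithmetic matters for a data-intensive code, the following minimal sketch compares the memory footprint of the same field stored in double and single precision, and measures the rounding error introduced. The field values (temperature-like numbers in kelvin) and the function name `footprint_bytes` are hypothetical and chosen only for illustration; this is not taken from NICAM or LETKF.

```python
from array import array


def footprint_bytes(values, typecode):
    """Return the number of bytes needed to store values with the given typecode."""
    buf = array(typecode, values)
    return buf.itemsize * len(buf)


# Hypothetical temperature-like field in kelvin (illustrative values only).
field = [280.0 + 0.001 * i for i in range(10_000)]

bytes_double = footprint_bytes(field, "d")  # 8 bytes per value
bytes_single = footprint_bytes(field, "f")  # 4 bytes per value

# Largest absolute error introduced by rounding each value to single precision.
max_rounding_error = max(abs(a - b) for a, b in zip(field, array("f", field)))

print(f"double: {bytes_double} bytes, single: {bytes_single} bytes")
print(f"max rounding error in single precision: {max_rounding_error:.2e} K")
```

Halving the bytes per value halves the volume that has to be moved through memory, the network, and storage. Whether the resulting rounding error is acceptable depends on the variable and the computation, so this has to be judged case by case.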
Based on the co-design results, we carried out a global 3.5 km mesh, 1024-member ensemble data assimilation using 131,705 nodes of Fugaku. In this ground-breaking experiment, 1.3 PB of data was transferred from the simulation to the ensemble data assimilation system. We showed that about four hours are required to complete one assimilation cycle.
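To put these figures in perspective, the back-of-the-envelope arithmetic below derives the average data volume per ensemble member and per node. It assumes decimal units (1 PB = 10^15 bytes) and an even spread of the data across members and nodes. Neither assumption is stated in the abstract, so the results are order-of-magnitude estimates only.

```python
# Figures quoted in the abstract.
TOTAL_BYTES = 1.3e15   # 1.3 PB, assuming decimal units (1 PB = 10**15 bytes)
MEMBERS = 1024         # ensemble size
NODES = 131_705        # Fugaku nodes used

# Assumption: data is spread evenly across members and across nodes.
per_member_tb = TOTAL_BYTES / MEMBERS / 1e12
per_node_gb = TOTAL_BYTES / NODES / 1e9

print(f"average per ensemble member: {per_member_tb:.2f} TB")
print(f"average per node:            {per_node_gb:.2f} GB")
```

Under these assumptions, each member contributes a little over 1 TB and each node handles on the order of 10 GB, which gives a sense of why the transfer between the simulation and the data assimilation was a target for optimization.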