Japan Geoscience Union Meeting 2022

Presentation information

[J] Poster presentation

Session symbol S (Solid Earth Sciences) » S-TT Technology & Techniques

[S-TT41] The Future of Solid Earth Science Opened Up by High-Performance Computing

Monday, May 30, 2022, 11:00-13:00, Online Poster Zoom Room (25) (Ch.25)

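Convener: Takane Hori (Research and Development Center for Earthquake and Tsunami, Japan Agency for Marine-Earth Science and Technology), Convener: Yuji Yagi (Faculty of Life and Environmental Sciences, University of Tsukuba), Katsuhiko Shiomi (National Research Institute for Earth Science and Disaster Resilience), Chairperson: Takane Hori (Japan Agency for Marine-Earth Science and Technology)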

11:00-13:00

[STT41-P01] Optimization of the integrated tsunami analysis code JAGURS for Earth Simulator 4

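*Kentaro Imai1, Toshitaka Baba2, Yoshiyuki Imato1, Hitoshi Uehara1, Toshihiro Kato3, Takane Hori1 (1. Japan Agency for Marine-Earth Science and Technology, 2. Tokushima University, 3. NEC)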

Keywords: tsunami, supercomputer, numerical computation

The Japan Agency for Marine-Earth Science and Technology (JAMSTEC) has been developing an integrated tsunami analysis code, JAGURS. In addition to the nonlinear long-wave theory used for tsunami hazard mapping, JAGURS provides various higher-order analysis functions that account for dispersion effects (Baba et al., 2021), seawater compressibility and elastic deformation of the Earth's crust due to seawater loading (Baba et al., 2017), and submarine landslide tsunamis (Baba et al., 2019).
JAMSTEC has also been actively introducing high-performance computing hardware, and in FY2020 it launched Earth Simulator 4 (ES4), a multi-architecture supercomputer based on AMD CPUs combined with NEC's Vector Engine and NVIDIA's A100 GPU. The per-core vector arithmetic performance of ES4 is about 4.8 times that of Earth Simulator 3 (ES3), and the theoretical computing performance of the entire system is 19.5 PFLOPS. Various scientific and technological codes, including JAGURS, have been optimized for and used on the ES series.
JAGURS is still being optimized and tuned for ES4, and it is being used to understand large earthquakes through high-precision numerical tsunami analysis (e.g., Kusumoto et al., 2021).
In this presentation, we report the effect of tuning JAGURS for ES4 and the change in performance relative to ES3.
Nonlinear long-wave theory (Case 1) and nonlinear dispersive wave theory (Case 2) were selected as the model cases for optimizing the JAGURS code. For both cases, basic performance information was first collected on ES3. The codes were then ported to the ES4 system, and computations were run under similar boundary conditions to confirm stable operation. As a result, the computational performance ratio relative to ES3 was over 2x in Case 1 and less than 2x in Case 2. In addition, we collected information on the computational load of each subroutine to identify those with the highest load, and the subroutines with high computational load were then optimized for ES4. This optimization raised the computing performance ratio to about 3x in Case 1, suggesting that ES4 is effective in improving computing performance. In Case 2, on the other hand, the performance ratio remained less than 2x, suggesting that memory transfer performance had a significant impact on the improvement in computing performance.
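The workflow described above (collect per-subroutine load, rank the hotspots, compare run times between systems) can be sketched in a few lines of Python. This is only a minimal illustration, not the authors' tooling. The subroutine names and timings are hypothetical placeholders, not JAGURS routines or measured data. The roofline function is a standard simplified model, added here as an assumption to show why a memory-bound kernel may gain little from higher arithmetic performance.

```python
from typing import Dict, List, Tuple


def rank_hotspots(timings: Dict[str, float]) -> List[Tuple[str, float]]:
    """Return (subroutine, share of total time) pairs, highest load first."""
    total = sum(timings.values())
    shares = [(name, seconds / total) for name, seconds in timings.items()]
    return sorted(shares, key=lambda item: item[1], reverse=True)


def performance_ratio(t_old: float, t_new: float) -> float:
    """Speedup of the new system over the old one for the same workload."""
    return t_old / t_new


def attainable_gflops(peak_gflops: float, bandwidth_gbs: float,
                      flops_per_byte: float) -> float:
    """Simplified roofline: the lower of the compute and memory ceilings."""
    return min(peak_gflops, bandwidth_gbs * flops_per_byte)


# Hypothetical per-subroutine wall-clock times in seconds (not measured data).
example_profile = {
    "momentum_update": 60.0,
    "mass_update": 25.0,
    "boundary_exchange": 10.0,
    "output": 5.0,
}

hotspots = rank_hotspots(example_profile)
top_name, top_share = hotspots[0]
print(f"highest load: {top_name} ({top_share:.0%} of run time)")

# Hypothetical run times: 100 s on the old system, 50 s on the new one.
print(f"performance ratio: {performance_ratio(100.0, 50.0):.1f}x")

# A kernel doing few operations per byte moved is capped by memory bandwidth,
# so raising peak arithmetic performance alone does not speed it up.
print(attainable_gflops(peak_gflops=1000.0, bandwidth_gbs=100.0, flops_per_byte=0.5))
```

In this sketch, the low-intensity kernel is limited by the bandwidth term rather than the peak term, which is one way to read the Case 2 result: if the dispersive solver moves much data per arithmetic operation, its speedup would track memory transfer performance rather than vector arithmetic performance.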