11:15 〜 11:30
[SCG44-20] Development of a multi-GPU and multi-node numerical code for a large-scale simulation of slow-to-fast earthquakes
Keywords: slow earthquakes, numerical simulation, GPGPU
Slow slip events (SSEs) can be simulated numerically by combining a constitutive law for the shear zone with stress interaction in an elastic medium. A rate- and state-dependent friction law (RS-law) is sometimes used as the constitutive law to reproduce SSEs, under the assumption that deformation of the shear zone can be treated as slip on an interface that bounds the elastic medium. The boundary element method (BEM) is an effective way to calculate the spatio-temporal evolution of SSE slip on a plate interface (e.g., Matsuzawa et al., 2010, 2013). Because the elastic response is linear, BEM-based SSE simulations obtain the stress change rate at each element as the product of the elastostatic kernel matrix (N x N, where N is the number of elements) and the slip deficit rate vector (length N). Because only the interface is discretized, the computational cost of the BEM is usually smaller than that of methods that discretize the full 3D elastic medium, such as the finite element method. Even so, a relatively large model (N~170,000) takes more than 10 days to compute with the BEM, even on 256 nodes of the large computer system at the National Research Institute for Earth Science and Disaster Resilience (NIED).
Slow earthquakes obey a scaling law relating their spatial and temporal sizes (Ide et al., 2007), as fast (i.e., regular) earthquakes also do. In addition, understanding the slow-to-fast transition remains a challenging problem for these two multiscale phenomena. However, at least several million elements appear to be required to capture multiscale phenomena spanning 2 to 3 orders of magnitude in spatial scale. Faster numerical calculation is therefore necessary to run such multiscale simulations within a realistic time.
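To give a sense of scale, the storage of a dense N x N kernel can be estimated with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes a single dense double-precision matrix (8 bytes per entry, consistent with the use of DGEMV), and the value N = 2,000,000 is a hypothetical stand-in for "several million elements".

```python
def dense_kernel_gib(n_elements, bytes_per_value=8):
    """Storage of a dense N x N kernel in GiB (double precision by default)."""
    return n_elements ** 2 * bytes_per_value / 2 ** 30


if __name__ == "__main__":
    # 93,000 and 170,000 are the model sizes quoted in the abstract;
    # 2,000,000 is a hypothetical value for "several million elements".
    for n in (93_000, 170_000, 2_000_000):
        print(f"N = {n:>9,}: {dense_kernel_gib(n):10.1f} GiB")
```

Under these assumptions the kernel occupies roughly 215 GiB for N~170,000 and roughly 29 TiB for N = 2,000,000. Both storage and the cost of each matrix-vector product grow as N squared.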
I have developed a numerical code that simulates slow earthquakes in a multi-GPU, multi-node environment to overcome this calculation-time problem. NVIDIA Fortran (previously known as PGI Fortran) is used so that GPUs can be exploited in addition to CPUs, as each node of the NIED computer system has four NVIDIA Tesla V100 GPUs. One of the major bottlenecks for high-speed computing is evaluating the product of the elastostatic kernel matrix and a vector. For this calculation, I simply use DGEMV from the BLAS library on CPUs and from the cuBLAS library on GPUs. A hybrid GPU-CPU calculation is adopted to exploit the larger aggregate memory bandwidth of the node, because the performance of DGEMV is often limited by memory bandwidth. The simulation adopts an RS-law with cutoff velocities. The temporal evolution of slip velocity is simulated numerically, incorporating the elastic response of a semi-infinite medium and a realistic configuration of the plate interface, which is represented by small triangular elements.
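The idea of sharing one matrix-vector product among devices can be sketched as a row-block decomposition. The following pure-Python illustration is not the Fortran implementation described above: the function names, the proportional-to-bandwidth splitting rule, and the example weights are all assumptions made for illustration, and `matvec_block` merely stands in for a DGEMV call on one device.

```python
def split_rows(n_rows, weights):
    """Split range(n_rows) into contiguous blocks sized in proportion to weights."""
    total = sum(weights)
    bounds = [0]
    acc = 0
    for w in weights:
        acc += w
        bounds.append(round(n_rows * acc / total))
    bounds[-1] = n_rows  # guard against rounding in the last block
    return [(bounds[i], bounds[i + 1]) for i in range(len(weights))]


def matvec_block(kernel, vec, start, stop):
    """Stand-in for one DGEMV call on rows [start, stop) of the kernel."""
    return [sum(k * v for k, v in zip(kernel[i], vec)) for i in range(start, stop)]


def hybrid_matvec(kernel, vec, weights):
    """Compute kernel @ vec block by block, one block per (hypothetical) device."""
    out = []
    for start, stop in split_rows(len(kernel), weights):
        out.extend(matvec_block(kernel, vec, start, stop))
    return out


if __name__ == "__main__":
    kernel = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    slip_deficit_rate = [1.0, 1.0]
    # Hypothetical weights: one fast device and one slower device.
    print(hybrid_matvec(kernel, slip_deficit_rate, weights=[2, 1]))
```

Because each output row depends only on its own kernel row and the shared input vector, the blocks are independent and could be evaluated concurrently. In a bandwidth-bound kernel, sizing the blocks by each device's memory bandwidth is one plausible way to balance the load.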
To validate the newly developed code, I compared its results with those of the CPU-only code after 20,000 steps in a medium-sized model (N~93,000), which is similar to the Shikoku model of Matsuzawa et al. (2013). The relative error between the two codes is within 10^-10, so the new code reproduces the earlier numerical results well.
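A comparison of this kind can be expressed as a maximum element-wise relative error. The sketch below is one possible definition, not necessarily the one used here: the abstract does not state which quantity or norm was compared, and the function name and the `floor` guard against division by zero are assumptions.

```python
def max_relative_error(reference, candidate, floor=1e-300):
    """Largest |candidate - reference| / |reference| over all elements."""
    return max(
        abs(c - r) / max(abs(r), floor)
        for r, c in zip(reference, candidate)
    )


if __name__ == "__main__":
    cpu_only = [1.0, 2.0, 4.0]             # hypothetical reference values
    hybrid = [1.0, 2.0 + 2e-10, 4.0]       # hypothetical new-code values
    print(max_relative_error(cpu_only, hybrid) <= 1.1e-10)
```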
I then tested the code on a relatively large-scale model (N~170,000) that covers the Nankai and Hyuganada regions (e.g., Matsuzawa and Shibazaki, 2020). On 16 GPU-CPU nodes of the NIED computer, the calculation is 1.5 times faster than on 256 CPU-only nodes. The new code is therefore about 24 times faster per node than the previous CPU-only code.
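The per-node figure follows from normalizing the wall-clock speedup by the ratio of node counts. A minimal sketch of that arithmetic, with a hypothetical function name, is:

```python
def per_node_speedup(nodes_old, nodes_new, wallclock_speedup):
    """Speedup per node when nodes_new nodes run wallclock_speedup times faster than nodes_old nodes."""
    return nodes_old / nodes_new * wallclock_speedup


if __name__ == "__main__":
    # 256 CPU-only nodes versus 16 GPU-CPU nodes, 1.5 times faster in wall-clock time.
    print(per_node_speedup(256, 16, 1.5))
```

With the numbers quoted above, 256 / 16 = 16 times fewer nodes combined with a 1.5 times shorter run gives 16 x 1.5 = 24.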