[PEM15-09] Implementing neural network inference in a magnetohydrodynamic simulation code
Keywords: MHD simulation, neural network, machine learning, shock tube problem
Magnetohydrodynamic (MHD) simulations have become powerful research tools for understanding astrophysical phenomena as computational capability has grown. CANS+ is a public MHD simulation code (Matsumoto et al., 2019) that provides a suite of solvers for the nonlinear MHD equations. Like other modern MHD simulation codes, CANS+ employs the HLLD approximate Riemann solver and the hyperbolic divergence cleaning method, and it combines them with a fifth-order accurate reconstruction technique (the MP5 scheme) to deliver numerical solutions with high accuracy and stability. Because most of the computational cost of CANS+ comes from the reconstruction and the numerical flux calculation, accelerating these operations remains an open task.
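To illustrate where this cost arises, the sketch below (a minimal Python/NumPy rendering, not the CANS+ source) shows the unlimited fifth-order interface interpolation that forms the base stencil of the MP5 scheme; the monotonicity-preserving limiter of the full scheme is omitted.

    import numpy as np

    def mp5_base_interpolation(u):
        # Fifth-order, upwind-biased interpolation of cell averages u[i-2..i+2]
        # to the cell face at i+1/2. The full MP5 scheme additionally applies a
        # monotonicity-preserving limiter, which this sketch leaves out.
        face = np.full_like(u, np.nan)
        face[2:-2] = (2.0 * u[:-4] - 13.0 * u[1:-3] + 47.0 * u[2:-2]
                      + 27.0 * u[3:-1] - 3.0 * u[4:]) / 60.0
        return face

    # example: reconstruct a smooth density profile at the cell faces
    rho = np.linspace(1.0, 0.125, 64)
    rho_face = mp5_base_interpolation(rho)

A stencil of this kind must be evaluated for every primitive variable at every interface, followed by the HLLD flux evaluation, which is why these two steps dominate the run time.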
Recently, machine learning has attracted much attention across research fields. In particular, deep learning with deep neural networks (DNNs) has been applied to big-data analysis in everyday life and has become an important computational technology. In addition, DNNs are well suited to GPUs, making them a research subject in high-performance computing as well.
We introduce DNN inference as an alternative to the most computationally expensive parts of CANS+. These expensive operations can be replaced by DNN inference, which can be further accelerated on GPUs. In this study, we treat the 1D shock tube problem and generate training data from CANS+ simulations initialized with randomly selected conditions. The training data comprise up to 300,000 sets.
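A minimal sketch of how such randomized shock-tube initial conditions could be drawn is given below; the choice of variables and the parameter ranges are illustrative assumptions, not the settings actually used with CANS+.

    import numpy as np

    rng = np.random.default_rng(seed=1)

    def random_shock_tube(nx=512):
        # One randomized Riemann problem: uniform left and right states separated
        # at x = 0.5. The variables and ranges here are assumptions for
        # illustration only.
        x = np.linspace(0.0, 1.0, nx)
        state = {}
        for var, (lo, hi) in {"rho": (0.1, 2.0), "p": (0.1, 2.0),
                              "vx": (-1.0, 1.0), "by": (-1.0, 1.0)}.items():
            left, right = rng.uniform(lo, hi, size=2)
            state[var] = np.where(x < 0.5, left, right)
        return x, state

Each such state would then be evolved by CANS+, with pairs of (six-cell primitive stencil, interface flux) recorded along the way to accumulate the training set.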
The input to the DNN is the primitive variables in six computational cells, and the label is the corresponding numerical flux. The dataset was standardized before training. We examined three network models in pursuit of high accuracy and compared their accuracy and computation time to find the best-performing model.
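A minimal PyTorch sketch of such a flux-predicting network follows; the number of primitive variables, the number of flux components, and the layer widths are placeholder assumptions rather than the tuned architectures presented here.

    import torch
    import torch.nn as nn

    N_VARS = 7    # assumed 1D MHD primitives per cell: rho, vx, vy, vz, By, Bz, p
    N_FLUX = 7    # assumed number of flux components at one interface
    STENCIL = 6   # six computational cells feed one interface flux

    class FluxNet(nn.Module):
        # Fully connected network mapping the standardized primitive variables of
        # a six-cell stencil to the numerical flux at the central interface.
        def __init__(self, hidden=128):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(STENCIL * N_VARS, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, N_FLUX),
            )

        def forward(self, x):
            return self.net(x)

    # inputs and labels are standardized with training-set statistics
    def standardize(a, mean, std):
        return (a - mean) / std

A single network of this form corresponds to Model 1 below; splitting the output components across two smaller networks gives variants in the spirit of Models 2 and 3.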
Model 1 predicts the fluxes associated with all the primitive variables with a single network. Models 2 and 3 predict different groups of flux components, split according to the transverse and longitudinal components of the MHD equations. We found that all three models predict the numerical flux with 99.99% accuracy on validation data that were not used in training.
In this presentation, we detail the network structures and their prediction performance on the validation data after hyperparameter tuning. We also show the performance of the CANS+ code with these networks implemented, examined through 1D simulations.
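To illustrate what "the CANS+ code with these networks implemented" can look like in a 1D finite-volume update, the sketch below gathers every six-cell window of the grid, lets a trained network predict all interface fluxes in one batched call (where GPU inference pays off), and applies the usual conservative update; the function and variable names are ours, not those of CANS+.

    import numpy as np
    import torch

    @torch.no_grad()
    def predict_fluxes(model, prim, mean, std):
        # prim: (nx, n_vars) primitive variables. Returns the predicted flux at
        # the interface in the middle of each six-cell window, i.e. between
        # cells i+2 and i+3 for window start i.
        nx, nv = prim.shape
        idx = np.arange(6)[None, :] + np.arange(nx - 5)[:, None]
        windows = prim[idx].reshape(nx - 5, 6 * nv)
        x = torch.as_tensor((windows - mean) / std, dtype=torch.float32)
        return model(x).cpu().numpy()

    def conservative_update(cons, flux, dt, dx):
        # U_i^{n+1} = U_i^n - (dt/dx) * (F_{i+1/2} - F_{i-1/2}). Only cells with
        # a predicted flux on both sides are updated; boundary cells are left to
        # the boundary conditions.
        new = cons.copy()
        new[3:-3] -= (dt / dx) * (flux[1:] - flux[:-1])
        return new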