Automatic Generation of Earthquake Cycle Simulation and Data Assimilation Codes Using Large Language Models

Masayuki Kano; Kazuro Hirahara; Nobuki Kame; Tomohisa Okazaki

5:15 PM - 7:15 PM

[SCG60-P07] Automatic Generation of Earthquake Cycle Simulation and Data Assimilation Codes Using Large Language Models

*Masayuki Kano¹, Kazuro Hirahara²,³, Nobuki Kame⁴, Tomohisa Okazaki² (1. Graduate School of Science, Tohoku University, 2. RIKEN, 3. Kagawa University, 4. ERI, University of Tokyo)

Keywords: Large Language Models, Earthquake Cycle Simulation, Data Assimilation, Slow Slip Event

Recent advances in Large Language Models (LLMs), such as ChatGPT, have been remarkable, and LLMs are increasingly being adopted in research and education. However, their direct application to research work itself remains limited, and their full potential is still being explored. In this presentation, we report on an investigation into the feasibility of using LLMs to automatically generate simulation codes for research purposes. The investigation was conducted during the “Hackathon on Large Language Models and Earthquake Research” (Kubo, Wu, Kano, Kato, et al., 2024 SSJ meeting), held in August 2024.

Specifically, we focused on the following three topics, aiming to create Python codes for earthquake cycle simulations and related data assimilation tasks:
1. Generation of earthquake cycle simulation code using the spring-slider model
2. Generation of data assimilation code with a particle filter based on the Lorenz 63 model
3. Generation of a 4D variational data assimilation code for estimating friction parameters in slow slip regions
For these tasks, ChatGPT-4o was used for Topics 1 and 2, and Claude 3.5 Sonnet for Topic 3 (as of August 2024). The final objective was achieved for each topic. However, the code produced by the LLMs often did not work when simply executed as is, and considerable trial and error was needed to correct the formulas. Despite these challenges, a significant portion of the code could be generated automatically using LLMs. We believe this demonstrates their potential for improving the efficiency of numerical code development for research purposes. In this presentation, we will share the detailed processes and the challenges encountered before reaching successful outcomes.
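
To give a sense of the scale of code involved in Topic 2, the following is a minimal illustrative sketch of a bootstrap particle filter applied to the Lorenz 63 model. It is not the code generated during the hackathon. All names (`run_particle_filter`, `rk4_step`, and so on) and all settings are assumptions made for this example: the classic Lorenz parameter values, observation of all three state variables with Gaussian noise, a small random jitter added to each particle to limit ensemble collapse, and multinomial resampling at every assimilation cycle.

```python
import math
import random

# Classic Lorenz (1963) parameter values; an assumption, not taken from the abstract.
SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0


def lorenz63(state):
    """Time derivative of the Lorenz 63 system."""
    x, y, z = state
    return (SIGMA * (y - x), x * (RHO - z) - y, x * y - BETA * z)


def rk4_step(state, dt):
    """Advance the state by one fourth-order Runge-Kutta step."""
    def shifted(scale, k):
        return tuple(s + scale * ki for s, ki in zip(state, k))

    k1 = lorenz63(state)
    k2 = lorenz63(shifted(0.5 * dt, k1))
    k3 = lorenz63(shifted(0.5 * dt, k2))
    k4 = lorenz63(shifted(dt, k3))
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def advance(state, dt, n_steps):
    for _ in range(n_steps):
        state = rk4_step(state, dt)
    return state


def distance(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def run_particle_filter(n_particles=200, n_cycles=50, steps_per_cycle=5,
                        dt=0.02, obs_std=1.0, jitter_std=0.3, seed=0):
    """Twin experiment: returns (filter_errors, obs_errors), one entry per cycle."""
    rng = random.Random(seed)

    def perturb(state, std):
        return tuple(c + rng.gauss(0.0, std) for c in state)

    # Spin the synthetic truth up onto the attractor, then scatter particles around it.
    truth = advance((1.0, 1.0, 1.0), dt, 500)
    particles = [perturb(truth, 3.0) for _ in range(n_particles)]
    filter_errors, obs_errors = [], []

    for _ in range(n_cycles):
        # Advance the truth and draw a noisy observation of the full state.
        truth = advance(truth, dt, steps_per_cycle)
        obs = perturb(truth, obs_std)

        # Forecast: propagate each particle, then add jitter (hypothetical model noise).
        particles = [perturb(advance(p, dt, steps_per_cycle), jitter_std)
                     for p in particles]

        # Analysis: Gaussian likelihood weights, computed in log space for stability.
        log_w = [-0.5 * (distance(obs, p) / obs_std) ** 2 for p in particles]
        top = max(log_w)
        weights = [math.exp(lw - top) for lw in log_w]
        total = sum(weights)

        # Weighted ensemble mean as the state estimate.
        mean = tuple(sum(w * p[i] for w, p in zip(weights, particles)) / total
                     for i in range(3))
        filter_errors.append(distance(mean, truth))
        obs_errors.append(distance(obs, truth))

        # Multinomial resampling in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)

    return filter_errors, obs_errors


if __name__ == "__main__":
    filt, obs = run_particle_filter()
    print("mean filter error:", sum(filt) / len(filt))
    print("mean observation error:", sum(obs) / len(obs))
```

In a twin experiment of this kind, comparing the error of the weighted ensemble mean with the error of the raw observations is one simple check that the assimilation is adding information. The jitter amplitude and particle count are tuning choices for this sketch and would need to be revisited for any real application.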

Presentation information

[S-CG60] Driving Solid Earth Science through Machine Learning
