*Ahyi KIM1, Yuji Nakamura1, Hiroki Uematsu1, Yohei Yukutake2, Yuki Abe3
(1.Yokohama City University, 2.Earthquake Research Institute, the University of Tokyo, 3.Hot Springs Research Institute, Kanagawa)
Keywords: Neural network, Machine Learning, Seismic Phase Picking
Kim et al. (2021, SSJ) developed machine-learning models to pick the seismic phases of earthquakes at Hakone volcano more accurately. They compared three models: model0, the pre-trained PhaseNet model of Zhu and Beroza (2018); model1, a model trained from scratch using the PhaseNet architecture; and model2, a model fine-tuned with model0 as the initial model. To train model1 and model2, they used about 220,000 seismic waveforms from about 30,000 events that occurred at Hakone volcano between 1999 and 2020. They found that model1 and model2 performed better than model0. In addition, the detection rate for continuous data from swarm earthquakes, which were not used for training or validation, was greatly improved with model1 and model2. However, when multiple seismic waves were present in the same time window, model1 and model2 missed the large-amplitude arrivals, although they slightly raised the P- and S-wave probabilities for small-amplitude arrivals that model0 did not recognize at all.
In this study, we conducted two experiments to overcome this problem and improve the performance of the model. First, we changed the length of the time window used for phase picking. We used data from May 1 to 20, 2019, the period of intense swarm activity examined in Kim et al. (2021, SSJ). We divided the continuous waveforms into windows of 1 hour, 30 minutes, 10 minutes, 1 minute, and 3 seconds. With model1, we detected 461, 509, 691, 875, and 1028 events, respectively, using the phase association method of Zhang et al. (2019). These results show that the number of detected events increased substantially as the time window was shortened. However, when multiple earthquakes occurred within a short time interval, the same tendency as described above was still observed.
Second, from the 55,000 waveforms used as validation data by Kim et al. (2021, SSJ), we extracted 25,000 waveforms that contained multiple events, re-labeled 10,000 of them with multiple picks, fine-tuned model1, and evaluated the results with the remaining validation data. As a result, we improved the probability of detecting both large- and small-amplitude arrivals in the same time window, which had previously been missed. In future work, we will increase the amount of data containing multiple earthquakes and apply the resulting models to the continuous data to determine the optimal training method, proportion of training data, and time-window length.
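The two experiments can be illustrated with a minimal Python sketch. It is not the authors' code: the abstract does not state how the continuous records were cut or how the multiple picks were encoded, so the function names `window_bounds` and `multi_pick_label`, the 100 Hz sampling rate, the non-overlapping windows, and the Gaussian-shaped targets (of the kind commonly used for PhaseNet-style pickers) are all assumptions made for illustration.

```python
import math


def window_bounds(n_samples, sampling_rate_hz, window_s):
    """Return (start, end) sample indices of consecutive, non-overlapping windows.

    The last window is shorter when the record length is not a multiple
    of the window length. (Hypothetical helper, not from the abstract.)
    """
    win = int(round(window_s * sampling_rate_hz))
    if win <= 0:
        raise ValueError("window must span at least one sample")
    return [(s, min(s + win, n_samples)) for s in range(0, n_samples, win)]


def multi_pick_label(n_samples, pick_indices, sigma_samples):
    """Build a target trace with one Gaussian peak per labeled arrival.

    Overlapping peaks are combined with a maximum so the target stays
    in [0, 1]. (Assumed label shape; the abstract does not specify it.)
    """
    label = [0.0] * n_samples
    for p in pick_indices:
        for i in range(n_samples):
            g = math.exp(-0.5 * ((i - p) / sigma_samples) ** 2)
            if g > label[i]:
                label[i] = g
    return label


# One hour of data at an assumed 100 Hz, split into 1-minute windows.
bounds = window_bounds(n_samples=360_000, sampling_rate_hz=100.0, window_s=60.0)
print(len(bounds), bounds[0], bounds[-1])

# A 3 s window holding two arrivals of the same phase from different events
# (pick positions and sigma are made-up values).
label = multi_pick_label(n_samples=300, pick_indices=[80, 210], sigma_samples=10.0)
print(round(label[80], 3), round(label[210], 3))
```

With a single-pick label, only one of the two arrivals in the second example would carry a target peak, which is one plausible reason a model trained that way favors one arrival over the other. Labeling every arrival in the window gives the fine-tuning step a target for both.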