Japan Geoscience Union Meeting 2025

Presentation information

[J] Oral

S (Solid Earth Sciences) » S-TT Technology & Techniques

[S-TT43] Seismic Big Data Analysis Based on the State-of-the-Art of Bayesian Statistics

Mon. May 26, 2025 10:45 AM - 12:15 PM 201A (International Conference Hall, Makuhari Messe)

Convener: Hiromichi Nagao (Earthquake Research Institute, The University of Tokyo), Aitaro Kato (Earthquake Research Institute, The University of Tokyo), Keisuke Yano (The Institute of Statistical Mathematics), Takahiro Shiina (National Institute of Advanced Industrial Science and Technology), Chairperson: Hiromichi Nagao (Earthquake Research Institute, The University of Tokyo), Aitaro Kato (Earthquake Research Institute, The University of Tokyo), Keisuke Yano (The Institute of Statistical Mathematics), Takahiro Shiina (National Institute of Advanced Industrial Science and Technology)

11:30 AM - 11:45 AM

[STT43-04] Ambient Noise Missing Data Prediction Using Long Short-Term Memory (LSTM) in the United Arab Emirates (UAE)

*Intan Andriani Putri1, Mohammed Ali1, Fateh Bouchaala1, Jun Matsushima2 (1.KUST, 2.UOT)

Keywords: Seismology, Ambient Noise, Machine Learning, Long Short-Term Memory (LSTM)

The modeling of S-wave velocity can be effectively conducted using Ambient Noise Tomography (ANT), a technique particularly well suited to regions with low seismicity, such as the United Arab Emirates (UAE). Unlike conventional seismic methods that rely on earthquake signals, ANT uses ambient noise signals. Seismic recordings from two stations on the same day are cross-correlated to obtain the Empirical Green's Function (EGF), which serves as the basis for modeling S-wave velocity. However, technical problems such as recorder malfunctions or power loss (often due to insufficient solar energy at certain times) lead to gaps in the recordings. Even a missing one-hour segment within a 24-hour dataset may render the entire day's recording unusable, resulting in significant data loss. As a potential solution, missing segments can be interpolated and/or extrapolated so that the complete dataset can be used. However, interpolation and extrapolation often distort the natural trend of the EGF. Recent advances in machine learning have demonstrated its effectiveness in time-series prediction. In this study, a supervised Long Short-Term Memory (LSTM) network was employed to reconstruct incomplete seismic recordings from the UAE. The dataset spans 2014 to 2018 and was recorded by 33 seismic stations. A subset of complete 24-hour recordings was used to train the LSTM model, ensuring high validation accuracy before application to incomplete datasets. The proposed approach showed strong agreement between actual and predicted seismograms in both the training and validation datasets. A Mean Squared Error (MSE) of 0.81 and a coefficient of determination (R²) of 0.78 were achieved for the training dataset, while the validation dataset yielded an MSE of 0.78 and an R² of 0.71. This study underscores the potential of LSTM for reconstructing missing seismic data without incurring additional measurement costs.
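The cross-correlation step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' processing chain: the function name `cross_correlate`, the `max_lag` parameter, and the use of plain mean removal are assumptions made for illustration, and practical ANT workflows usually add further preprocessing and stacking that the abstract does not describe.

```python
import numpy as np

def cross_correlate(trace_a, trace_b, max_lag):
    """Normalized cross-correlation of two equal-length traces.

    Hypothetical helper for illustration. Returns (lags, cc), where lags are
    in samples and a peak at a positive lag means trace_b is delayed
    relative to trace_a.
    """
    a = np.asarray(trace_a, dtype=float)
    b = np.asarray(trace_b, dtype=float)
    if a.shape != b.shape or a.ndim != 1:
        raise ValueError("traces must be 1-D and of equal length")
    n = a.size
    if not 1 <= max_lag < n:
        raise ValueError("max_lag must satisfy 1 <= max_lag < len(trace)")

    # Remove the mean so a constant offset does not dominate the result.
    a = a - a.mean()
    b = b - b.mean()

    # Zero-pad to twice the length so the FFT product gives a linear
    # (not circular) correlation.
    nfft = 2 * n
    spectrum = np.fft.rfft(b, nfft) * np.conj(np.fft.rfft(a, nfft))
    full = np.fft.irfft(spectrum, nfft)

    # Non-negative lags sit at the start of `full`, negative lags at the end.
    cc = np.concatenate((full[-max_lag:], full[:max_lag + 1]))
    lags = np.arange(-max_lag, max_lag + 1)

    # Scale by the trace energies so values fall within [-1, 1].
    norm = np.sqrt(np.sum(a ** 2) * np.sum(b ** 2))
    if norm > 0:
        cc = cc / norm
    return lags, cc
```

A gap in either daily trace removes or corrupts part of the sum behind every lag, which is why an incomplete day is often discarded rather than correlated.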
By leveraging the correlations between missing segments and surrounding data, the entire missing block is reconstructed through an iterative strategy. The proposed method provides an effective alternative for improving seismic data continuity, offering valuable applications in seismology.
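One way to read the iterative strategy is as recursive one-step-ahead prediction, in which each predicted sample is fed back as input for the next. The sketch below shows that loop under stated assumptions: a least-squares linear predictor stands in for the trained LSTM (it is not the authors' model), and the names `fit_one_step_predictor`, `fill_gap_iteratively`, `mse_and_r2`, and `window` are hypothetical. The metric definitions are the standard ones, since the abstract does not spell out its formulas.

```python
import numpy as np

def fit_one_step_predictor(series, window):
    """Fit a linear map from `window` past samples to the next sample.

    Stand-in for a trained LSTM: any callable that maps a window of past
    samples to one predicted sample could be used in its place.
    """
    x = np.asarray(series, dtype=float)
    rows = np.array([x[i:i + window] for i in range(x.size - window)])
    targets = x[window:]
    coeffs, *_ = np.linalg.lstsq(rows, targets, rcond=None)

    def predict(past):
        return float(np.dot(past, coeffs))

    return predict

def fill_gap_iteratively(trace, gap_start, gap_len, predict, window):
    """Fill trace[gap_start:gap_start + gap_len] one sample at a time.

    Each new sample is predicted from the `window` samples before it, so
    later predictions depend on earlier ones.
    """
    filled = np.array(trace, dtype=float)
    if gap_start < window:
        raise ValueError("need at least `window` samples before the gap")
    for i in range(gap_start, gap_start + gap_len):
        filled[i] = predict(filled[i - window:i])
    return filled

def mse_and_r2(actual, predicted):
    """Mean squared error and coefficient of determination."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residual = actual - predicted
    mse = float(np.mean(residual ** 2))
    total = np.sum((actual - actual.mean()) ** 2)
    r2 = float(1.0 - np.sum(residual ** 2) / total)
    return mse, r2
```

Because each prediction becomes input to the next, errors can accumulate across a long gap. Scoring the filled block against held-out complete recordings, as the reported MSE and R² do, is a natural check before applying the model to days with real gaps.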