3:30 PM - 3:45 PM
[SSS11-01] Predicting time series of real-time seismic intensity using non-linear graph-based dimensionality reduction
Keywords: Real-time seismic intensity, Time-series prediction, Non-linear graph-based dimensionality reduction
Numerical simulations of seismic wave propagation have been widely used to predict seismic waveforms and time series of ground-motion indices, although their high computational cost has been a major challenge. With recent advances in machine learning, this prediction problem has increasingly been approached using machine learning techniques. In this study, we propose a prediction method using a nonlinear dimensionality reduction technique. The target for prediction is the real-time seismic intensity, an index equivalent to seismic intensity per second (Kunugi et al., 2008, 2013). Our previous study (Kubo & Miyamoto, 2024, Proceedings of the Annual Conference of JSAI) found that when the observed maximum value was large, the prediction tended to be significantly underestimated, probably because that study predicted the time-series shape and the maximum value simultaneously. To address this issue, we adopted an approach in which the time-series shape and the maximum value are predicted separately.
In this study, we focus on predicting the real-time seismic intensity time series at a single observation station based on source information. To predict the time-series shape, we followed the method of the previous study (Kubo & Miyamoto, 2024), which combined the graph-based nonlinear dimensionality reduction technique UMAP (McInnes et al., 2018) with random forest regression. UMAP is a nonlinear dimensionality reduction technique based on manifold learning and topological data analysis. It was used to reduce the time-series data to a 2D map, and random forest regression was used to estimate the location on the 2D map from the input source information. Before dimensionality reduction, each time series was normalized using -3.5 and its own maximum value. For predicting the maximum value, we adopted a hybrid approach combining the ground motion prediction equation of Morikawa & Fujiwara (2013) and Gaussian process regression (Kubo et al., 2020; Kubo & Miyamoto, 2023). From the input source information (source distance, depth, moment magnitude Mw, latitude, and longitude), the time-series shape and the maximum value of real-time seismic intensity were predicted separately. The predicted maximum value was then used to denormalize the predicted time series, yielding the final real-time seismic intensity prediction.
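The normalization and denormalization steps can be illustrated with a minimal sketch. The abstract states only that -3.5 and each record's maximum value were used. The min-max form below, in which -3.5 maps to 0 and the record maximum maps to 1, is an assumption made for illustration, and the function names `normalize` and `denormalize` are hypothetical.

```python
import numpy as np

FLOOR = -3.5  # reference value quoted in the abstract; its role as a lower bound is assumed


def normalize(series, floor=FLOOR):
    """Scale one record so that `floor` maps to 0 and the record maximum maps to 1.

    Assumed min-max form; the abstract does not give the exact formula.
    Returns the normalized shape and the record maximum.
    """
    series = np.asarray(series, dtype=float)
    peak = series.max()
    if peak <= floor:
        raise ValueError("record maximum must exceed the floor value")
    return (series - floor) / (peak - floor), peak


def denormalize(shape, predicted_peak, floor=FLOOR):
    """Rescale a normalized shape using a separately predicted maximum value."""
    shape = np.asarray(shape, dtype=float)
    return shape * (predicted_peak - floor) + floor
```

Under this assumed form, the shape and the amplitude are decoupled: a shape predicted from the 2D map can be rescaled with any maximum value, for example one obtained from a ground motion prediction equation corrected by Gaussian process regression.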
We applied this prediction method to actual records. The prediction was conducted for the K-NET Tsukuba (IBR011) station of NIED K-NET. Records from 1997 to 2015 were used as training data, while records from 2016 to 2021 were used as test data. The real-time seismic intensity was computed from 100 Hz sampled acceleration waveforms at a sampling rate of 1 Hz. After applying peak-hold processing, a time window of up to 120 seconds was extracted, starting 5 seconds before the theoretical P-wave arrival time. Records meeting any of the following conditions were excluded: (1) those without data available from 5 seconds before the theoretical P-wave arrival, (2) those where the real-time seismic intensity at the theoretical P-wave arrival time was greater than zero, and (3) those where the maximum real-time seismic intensity was less than 1. As a result, the training dataset consisted of 153 records, and the test dataset contained 59 records. The source information was obtained from the NIED F-net moment tensor catalog.
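The windowing and exclusion rules above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: peak-hold processing is interpreted here as a running maximum, the maximum in condition (3) is taken over the extracted window, and a record that ends before the theoretical P-wave arrival is treated as lacking data. The function names `peak_hold` and `extract_window` and the argument `start_time` are hypothetical.

```python
import numpy as np

PRE_P = 5     # seconds kept before the theoretical P-wave arrival
WINDOW = 120  # maximum window length in seconds (1 Hz samples)


def peak_hold(intensity):
    """Running maximum of the series (assumed meaning of peak-hold processing)."""
    return np.maximum.accumulate(np.asarray(intensity, dtype=float))


def extract_window(intensity, start_time, p_arrival):
    """Return the peak-held window for one record, or None if the record is excluded.

    intensity  : real-time seismic intensity sampled at 1 Hz
    start_time : time of the first sample in seconds
    p_arrival  : theoretical P-wave arrival time in seconds
    """
    held = peak_hold(intensity)
    first = int(round(p_arrival - PRE_P - start_time))
    p_index = first + PRE_P
    # (1) no data available from 5 s before the theoretical P-wave arrival
    if first < 0 or p_index >= held.size:
        return None
    # (2) intensity already greater than zero at the theoretical P-wave arrival
    if held[p_index] > 0:
        return None
    window = held[first:first + WINDOW]
    # (3) maximum real-time seismic intensity less than 1
    if window.max() < 1:
        return None
    return window
```

Because the window is "up to" 120 seconds, the sketch returns a shorter array when the record ends early rather than padding it.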
For the February 13, 2021, off-Fukushima earthquake (Mw 7.1, depth 53 km), for which previous studies exhibited significant underestimation, our proposed method predicted a maximum value closer to the observed one. Overall, our method tended to reduce underestimation for large-amplitude records, although the underestimation was not completely resolved. In some cases, the predicted and observed time-series shapes also differed. The limited amount of training data likely affected the results. A hybrid approach that integrates the empirical formula-based method for real-time seismic intensity time-series prediction (Kubo & Kunugi, 2022) is considered a promising improvement.