Japan Geoscience Union Meeting 2021

[J] Poster

S (Solid Earth Sciences) » S-TT Technology & Techniques

[S-TT37] Seismic Big Data Analysis Based on the State-of-the-Art of Bayesian Statistics

Thu. Jun 3, 2021 5:15 PM - 6:30 PM Ch.14

Convener: Hiromichi Nagao (Earthquake Research Institute, The University of Tokyo), Aitaro Kato (Earthquake Research Institute, The University of Tokyo), Keisuke Yano (The Institute of Statistical Mathematics), Takahiro Shiina (National Institute of Advanced Industrial Science and Technology)

[STT37-P01] Clustering seismic noises using semi-supervised learning

*Kotaro Sato1, Keisuke Yano2, Komaki Fumiyasu1 (1. The University of Tokyo, 2. The Institute of Statistical Mathematics)


Keywords: seismic noise, semi-supervised learning, classification

Seismic records contain various kinds of seismic activity (such as normal and weak tectonic tremors) as well as noise signals. One example is ambient noise caused by atmospheric pressure fluctuations [1], temperature changes [2], and waves and wind [3]. Another is noise from human activities such as automobiles, trains, airplanes, and wind power generation [3]. Appropriate classification of this wide variety of seismic activities and noises not only improves seismic detection accuracy but also advances the understanding of the surrounding ground and site characteristics [4].

Noise is usually classified by unsupervised learning because noise signals are difficult to label. Johnson et al. [5] used the K-means method to cluster noise signals after removing obvious earthquake signals. Because complete earthquake detection is difficult, the remaining noise signals still contain earthquake signals that were missed by manual inspection [6]. However, when we applied a model trained with the method of Johnson et al. to regular earthquake signals, we found that these signals were distributed uniformly across the clusters.

In this study, we propose a semi-supervised learning method that uses known seismic records as labeled data, so that the labeled seismic data are organized into a single cluster while the other weak noise waveforms are clustered appropriately. We first employ t-Distributed Stochastic Neighbor Embedding (t-SNE) to reduce the dimensionality. We then fit a semi-supervised Gaussian mixture model and perform classification using its likelihood.
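
The second step can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how the labels enter the mixture model, so the code assumes one plausible variant in which component 0 is reserved for the labeled earthquake records and their responsibilities are held fixed during EM. The function names (`log_gauss`, `fit_semi_supervised_gmm`, `predict`), the farthest-point initialization, and the regularization constant are all hypothetical choices. The inputs are assumed to be waveform features already reduced to a few dimensions (for example by scikit-learn's `TSNE`, which is not run here).

```python
import numpy as np


def log_gauss(X, mean, cov):
    """Log density of a multivariate normal, evaluated at each row of X."""
    d = X.shape[1]
    chol = np.linalg.cholesky(cov)
    z = np.linalg.solve(chol, (X - mean).T)  # whitened residuals, shape (d, n)
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (d * np.log(2.0 * np.pi) + log_det + np.sum(z ** 2, axis=0))


def weighted_log_densities(X, weights, means, covs):
    """Column j holds log(weight_j) + log N(x | mean_j, cov_j)."""
    return np.stack(
        [np.log(w) + log_gauss(X, m, c) for w, m, c in zip(weights, means, covs)],
        axis=1,
    )


def fit_semi_supervised_gmm(X_lab, X_unl, n_components=3, n_iter=100, reg=1e-6):
    """EM for a Gaussian mixture in which labeled rows are pinned to component 0.

    Assumption: X_lab holds known earthquake records, X_unl everything else.
    """
    X = np.vstack([X_lab, X_unl])
    n_lab, d = X_lab.shape
    k = n_components

    # Initialisation: component 0 at the labeled mean, the others at the
    # unlabeled points farthest from all means chosen so far.
    means = [X_lab.mean(axis=0)]
    for _ in range(k - 1):
        d2 = np.min([np.sum((X_unl - m) ** 2, axis=1) for m in means], axis=0)
        means.append(X_unl[np.argmax(d2)])
    means = np.array(means)
    covs = np.stack([np.cov(X.T) + reg * np.eye(d)] * k)
    weights = np.full(k, 1.0 / k)

    # Labeled rows: responsibility 1 for component 0, never updated.
    resp_lab = np.zeros((n_lab, k))
    resp_lab[:, 0] = 1.0

    for _ in range(n_iter):
        # E-step, unlabeled rows only.
        log_p = weighted_log_densities(X_unl, weights, means, covs)
        log_p -= log_p.max(axis=1, keepdims=True)
        resp_unl = np.exp(log_p)
        resp_unl /= resp_unl.sum(axis=1, keepdims=True)
        resp = np.vstack([resp_lab, resp_unl])

        # M-step over labeled and unlabeled rows together.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / nk.sum()
        means = (resp.T @ X) / nk[:, None]
        for j in range(k):
            diff = X - means[j]
            covs[j] = (resp[:, j, None] * diff).T @ diff / nk[j] + reg * np.eye(d)
    return weights, means, covs


def predict(X, weights, means, covs):
    """Assign each row to the component with the largest weighted likelihood."""
    return weighted_log_densities(X, weights, means, covs).argmax(axis=1)
```

With this layout, a call such as `predict(X_new, *fit_semi_supervised_gmm(X_lab, X_unl))` returns 0 for waveforms that fall in the earthquake cluster and a positive index for the noise clusters. Pinning the labeled responsibilities is only one way to inject supervision; a real analysis would also need to choose the number of components and check sensitivity to the embedding.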

We verified the accuracy of the proposed method using the vertical acceleration waveform records acquired by the seismograph array of the Metropolitan Area Seismic Observing Network (MeSO-net). The results confirm that the seismic waveforms are clustered more tightly in terms of the log-likelihood. The frequency and amplitude characteristics of the data belonging to each cluster will be reported.

References
[1] Lei Qin, Frank L. Vernon, Christopher W. Johnson, and Yehuda Ben-Zion. Spectral characteristics of daily to seasonal ground motion at the Piñon Flats Observatory from coherence of seismic data. Bulletin of the Seismological Society of America, Vol. 109, No. 5, pp. 1948–1967, 2019.
[2] Gregor Hillers and Yehuda Ben-Zion. Seasonal variations of observed noise amplitudes at 2–18 Hz in southern California. Geophysical Journal International, Vol. 184, No. 2, pp. 860–868, 2011.
[3] Kawakita, Y., Sakai, S., Various Types of Noise in MeSO-net. Earthquake Res. Inst. Lec., Vol. 84, No. 2, pp. 127–139, 2009.
[4] Qingkai Kong, Daniel T. Trugman, Zachary E. Ross, Michael J. Bianco, Brendan J. Meade, and Peter Gerstoft. Machine learning in seismology: Turning data into insights. Seismological Research Letters, Vol. 90, No. 1, pp. 3–14, 2019.
[5] Christopher W. Johnson, Yehuda Ben-Zion, Haoran Meng, and Frank Vernon. Identifying different classes of seismic noise signals using unsupervised learning. Geophysical Research Letters, Vol. 47, No. 15, p. e2020GL088353, 2020.
[6] Zachary E. Ross, Daniel T. Trugman, Egill Hauksson, and Peter M. Shearer. Searching for hidden earthquakes in southern California. Science, Vol. 364, No. 6442, pp. 767–771, 2019.