Detecting amplitude anomaly of seismograms with probabilistic model of spectral descriptors

Satoru Fukayama; Takahiko Uchide; Haruo Horikawa; Takahiro Shiina; Hiroki Kuroda; Jun Ogata

11:30 AM - 11:45 AM

[SCG51-09] Detecting amplitude anomaly of seismograms with probabilistic model of spectral descriptors

*Satoru Fukayama¹, Takahiko Uchide¹, Haruo Horikawa¹, Takahiro Shiina¹, Hiroki Kuroda², Jun Ogata¹ (1. National Institute of Advanced Industrial Science and Technology (AIST), 2. Ritsumeikan University)

Keywords: Seismometer, Anomaly Detection, Machine Learning

We propose a method for detecting amplitude anomalies of seismometers from a single event recorded at a single station. Amplitude anomalies are easily identified by observing multiple events over a long period, but detecting them from a single event enables faster recovery of failed seismometers. Our method detects anomalies by training a probabilistic model of features that represent the spectral shape, and achieves a detection performance of AUC = 0.93 (F1 = 0.905) using 10-dimensional features for each axis (NS, EW, UD) of the velocity waveform.

We define an amplitude anomaly of a seismometer as a case in which the amplitudes of its seismograms differ significantly from the expected amplitudes (e.g., more than ten times or less than 0.1 times the expected value). Existing studies of seismometer anomaly detection include one that used the amplitude ratio and cross-correlation coefficients of co-located broadband and strong-motion seismometers (Li et al., 2019), and one that used feature selection and outlier detection to achieve a detection performance of F1 = 0.9 or better using the 0.2 Hz value of the amplitude spectrum (Zaccarelli et al., 2021). We show that our approach achieves performance (F1 = 0.905) comparable to that of these existing studies.

Since an amplitude anomaly changes the shape of the amplitude spectrum, a probabilistic model of the spectral shape can be used to detect amplitude anomalies. We therefore conducted experiments to verify the usefulness of a probabilistic model built on spectral descriptors, which are indicators of the shape of the amplitude spectrum. After applying a Hamming window to the velocity waveform (300 seconds) of each axis for each seismic event, our method calculated the amplitude spectrum and then the spectral descriptors within the 0.02-10.0 Hz band. The spectral descriptors form a 10-dimensional feature vector consisting of Spectral Centroid, Spectral Spread, Spectral Skewness, Spectral Kurtosis, Spectral Entropy, Spectral Flatness, Spectral Crest, Spectral Slope, Spectral Decrease, and Spectral Rolloff Point (85%).
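As an illustration of the feature extraction described above, the following is a minimal sketch for one waveform component. The abstract does not give the formulas, so the definitions below follow common audio-analysis conventions and are assumptions, as are the function name `spectral_descriptors`, the use of the magnitude (rather than power) spectrum, and the small constant `eps`.

```python
import numpy as np

def spectral_descriptors(x, fs, fmin=0.02, fmax=10.0, rolloff=0.85):
    """Return 10 spectral descriptors of one waveform component (assumed definitions)."""
    x = np.asarray(x, dtype=float)
    # Hamming window, then one-sided amplitude spectrum
    amp = np.abs(np.fft.rfft(x * np.hamming(len(x))))
    freq = np.fft.rfftfreq(len(x), d=1.0 / fs)
    # Keep only the analysis band
    band = (freq >= fmin) & (freq <= fmax)
    f, s = freq[band], amp[band]
    eps = 1e-12                      # guards against division by zero and log(0)
    p = s / (s.sum() + eps)          # spectrum normalised to sum to 1

    centroid = np.sum(f * p)
    spread = np.sqrt(np.sum((f - centroid) ** 2 * p))
    skewness = np.sum((f - centroid) ** 3 * p) / (spread ** 3 + eps)
    kurtosis = np.sum((f - centroid) ** 4 * p) / (spread ** 4 + eps)
    # Entropy normalised by the log of the number of bins
    entropy = -np.sum(p * np.log(p + eps)) / np.log(len(p))
    # Flatness: geometric mean over arithmetic mean
    flatness = np.exp(np.mean(np.log(s + eps))) / (np.mean(s) + eps)
    # Crest: peak over arithmetic mean
    crest = s.max() / (np.mean(s) + eps)
    # Slope: least-squares slope of amplitude against frequency
    fc = f - f.mean()
    slope = np.sum(fc * (s - s.mean())) / (np.sum(fc ** 2) + eps)
    # Decrease: average drop relative to the first bin, weighted by 1/k
    k = np.arange(1, len(s))
    decrease = np.sum((s[1:] - s[0]) / k) / (np.sum(s[1:]) + eps)
    # Rolloff point: lowest frequency below which 85% of the summed amplitude lies
    cum = np.cumsum(s)
    idx = min(np.searchsorted(cum, rolloff * cum[-1]), len(f) - 1)
    rolloff_point = f[idx]

    return np.array([centroid, spread, skewness, kurtosis, entropy,
                     flatness, crest, slope, decrease, rolloff_point])

# Synthetic 300 s record sampled at 100 Hz (both values chosen for illustration)
fs = 100.0
t = np.arange(30000) / fs
demo = spectral_descriptors(np.sin(2 * np.pi * 2.0 * t), fs)
```

Calling the function once per axis (NS, EW, UD) yields one 10-dimensional vector per axis, matching the feature size stated in the abstract; how the three vectors are combined is not specified there.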

We compared the anomaly detection performance obtained with the spectral descriptors against that obtained with the log-filterbank output of the spectra (10-150 dims), which is a less manually designed feature. We used 76357 small-earthquake events with epicenters in northern Ibaraki Prefecture, recorded from July 1, 2016, to August 31, 2021, by the temporary seismometer network deployed in northern Ibaraki Prefecture by the National Institute of Advanced Industrial Science and Technology (AIST). We visually checked the ratios of the maximum amplitudes, NS/UD and EW/UD, computed after subtracting the mean over the event, and selected 12550 data points as amplitude anomalies.
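The amplitude ratios used for labeling can be sketched as follows. The labels in this study were assigned by visual inspection; the helper `flag_candidate` and its factor of 10 are hypothetical and only borrow the example thresholds from the definition of an amplitude anomaly given earlier, so they should be read as a way to shortlist candidates, not as the labeling rule itself.

```python
import numpy as np

def max_amplitude_ratios(ns, ew, ud):
    """Return (NS/UD, EW/UD) ratios of peak absolute amplitude after demeaning."""
    peaks = {}
    for name, trace in (("NS", ns), ("EW", ew), ("UD", ud)):
        trace = np.asarray(trace, dtype=float)
        # Subtract the mean over the event, then take the largest absolute value
        peaks[name] = np.max(np.abs(trace - trace.mean()))
    return peaks["NS"] / peaks["UD"], peaks["EW"] / peaks["UD"]

def flag_candidate(ratio, factor=10.0):
    """Hypothetical shortlist rule: ratio outside [1/factor, factor]."""
    return bool(ratio > factor or ratio < 1.0 / factor)

# Synthetic three-component record: NS is scaled by 20, EW carries a DC offset
t = np.linspace(0.0, 300.0, 30000, endpoint=False)
ud = np.sin(2 * np.pi * 1.0 * t)
ns = 20.0 * ud
ew = ud + 5.0
ns_ud, ew_ud = max_amplitude_ratios(ns, ew, ud)
```

In this synthetic case the NS/UD ratio is flagged and the EW/UD ratio is not, because demeaning removes the offset on EW before the peak is taken.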

The figure compares the performance of the Gaussian Mixture Model (GMM) and the Variational Autoencoder (VAE). The vertical axis is the Area Under the Curve (AUC) (value range: [0,1]) of the receiver operating characteristic (ROC), which indicates the anomaly detection performance. The horizontal axis is the dimensionality of the feature. In the legend, SPD represents the spectral descriptors, LFB represents the log-filterbank output, n in the left figure represents the number of mixtures in the GMM, and z_dim in the right figure represents the dimensionality of the latent variables of the VAE. With the GMM (left figure), the spectral descriptors outperform the log-filterbank output regardless of the number of mixtures (n). The log-filterbank output reaches its maximum performance at 40-60 dimensions, and performance worsens as the dimensionality of the feature increases further. With the VAE (right figure), the performance of the log-filterbank output improves as the dimensionality of the feature increases, and once the dimensionality exceeds 90, it is superior or comparable to that of the VAE with the spectral descriptors. The highest performance (AUC = 0.93, F1 = 0.905) is achieved with the GMM and the spectral descriptors.
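The GMM-based scoring and the AUC evaluation can be sketched with scikit-learn as below. This is not the authors' implementation: fitting on normal data only, standardizing the features, the number of mixtures, and the synthetic Gaussian data are all assumptions made for illustration, and the resulting AUC says nothing about the seismic data.

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

def fit_gmm(normal_features, n_components=4, seed=0):
    """Fit a scaler and a GMM to feature vectors assumed to be free of anomalies."""
    scaler = StandardScaler().fit(normal_features)
    gmm = GaussianMixture(n_components=n_components,
                          covariance_type="full",
                          random_state=seed)
    gmm.fit(scaler.transform(normal_features))
    return scaler, gmm

def anomaly_score(scaler, gmm, features):
    """Negative log-likelihood under the GMM: higher means more anomalous."""
    return -gmm.score_samples(scaler.transform(features))

# Synthetic stand-in for 10-dimensional descriptors (not real seismic features)
rng = np.random.default_rng(0)
normal_train = rng.normal(0.0, 1.0, size=(500, 10))
normal_test = rng.normal(0.0, 1.0, size=(200, 10))
anomalous_test = rng.normal(3.0, 1.0, size=(50, 10))

scaler, gmm = fit_gmm(normal_train)
test_features = np.vstack([normal_test, anomalous_test])
labels = np.concatenate([np.zeros(200), np.ones(50)])   # 1 marks an anomaly
scores = anomaly_score(scaler, gmm, test_features)

# AUC of the ROC curve obtained by sweeping a threshold over the scores
auc = roc_auc_score(labels, scores)
```

Because the AUC is computed from the ranking of the scores, it does not depend on choosing a detection threshold; an F1 value, by contrast, requires fixing one.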

The spectral descriptors used in this study were found to be useful for detecting amplitude anomalies. Furthermore, the low dimensionality of the spectral descriptors (10 dims for each axis) enabled the GMM to achieve performance comparable to that of existing studies. For higher-dimensional features, we suggest using a VAE to avoid the performance degradation caused by the increase in feature dimensionality. We consider the detection performance (AUC = 0.93) sufficient for detecting amplitude anomalies from the velocity waveform of a single event.

[Acknowledgement] This study was supported by MEXT Project for Seismology toward Research Innovation with Data of Earthquake (STAR-E) Grant Number JPJ010217.

Presentation information

[S-CG51] Driving Solid Earth Science through Machine Learning
