11:00 AM - 1:00 PM
[SSS10-P07] Extraction of commonalities in source characteristics by unsupervised learning for source parameters in and around Japan
Keywords: Unsupervised Learning, Source Characteristic, Strong Motion, Database
1. Introduction
Conventional ground motion prediction equations have been developed from empirical findings obtained mainly from large earthquakes throughout Japan, with events categorized by fault type (normal, reverse, and strike-slip) and earthquake type (crustal, inter-plate, and intra-slab) (e.g., Satoh (2010), Morikawa and Fujiwara (2013)).
However, the source parameters themselves may have interrelationships and regional characteristics that have gone unnoticed or have been difficult to incorporate into ground motion prediction equations. If these source characteristics can be reflected in ground motion prediction, it may be possible to reduce the variability of the prediction results.
In this study, we conducted a data-driven analysis of multidimensional data on source parameters with the aim of extracting new common source characteristics.
2. Analysis of Dimensionally Reduced Source Parameters
To clarify the features of the source parameters, we investigated 17,426 earthquakes that occurred from January 1997 to February 2018, whose source mechanisms were obtained by F-net and which are stored in the prototype strong-motion unified database (Morikawa et al., 2020). Each earthquake is described by 13 source parameters: the latitude and longitude of the epicenter, focal depth, moment magnitude, strike, dip, rake, and the six moment tensor components (Mxx, Mxy, Mxz, Myy, Myz, Mzz). Because the strike and rake angles are each represented by a pair of sine and cosine values, the source parameters form a 15-element vector (13 - 2 + 4 = 15).
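The encoding described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `encode_source`, the ordering of the elements, and the sample values are all assumptions made for the example. Presumably the sine and cosine pairs are used so that angles on either side of the 0/360 degree wrap stay close together in feature space, which the last lines demonstrate.

```python
import math


def encode_source(lat, lon, depth_km, mw, strike_deg, dip_deg, rake_deg, moment_tensor):
    """Build a 15-element feature vector from 13 source parameters.

    Hypothetical helper: the element order is an assumption, not taken
    from the abstract. Strike and rake are each replaced by (sin, cos).
    """
    if len(moment_tensor) != 6:
        raise ValueError("expected (Mxx, Mxy, Mxz, Myy, Myz, Mzz)")
    strike = math.radians(strike_deg)
    rake = math.radians(rake_deg)
    return [
        lat, lon, depth_km, mw,
        math.sin(strike), math.cos(strike),
        dip_deg,
        math.sin(rake), math.cos(rake),
        *moment_tensor,
    ]


# Made-up values for illustration only (not from the database).
tensor = (0.2, -0.1, 0.4, -0.5, 0.3, 0.3)
a = encode_source(38.0, 142.0, 30.0, 6.1, 359.0, 20.0, 90.0, tensor)
b = encode_source(38.0, 142.0, 30.0, 6.1, 1.0, 20.0, 90.0, tensor)

# Strikes of 359 and 1 degrees differ by 358 as raw numbers,
# but their (sin, cos) encodings are nearly identical.
strike_gap = math.hypot(a[4] - b[4], a[5] - b[5])
print(len(a), round(strike_gap, 4))
```

With the raw angle, the two events above would look far apart in strike even though their fault planes are almost parallel; the encoded gap is small.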
To analyze and visualize the source parameters, we standardized each element individually for feature scaling and then reduced the 15-element vector to a 2-dimensional vector by applying PCA followed by t-SNE. Low-dimensional data can be plotted directly on a 2D plot (hereafter called the 2D-map), which makes them easier to interpret. Finally, we analyzed the source characteristics suggested by the data on the dimensionally reduced 2D-map.
Because the location of the epicenter could be a major cause of region-by-region clustering in the 2D-map, we considered two cases: one in which the latitude and longitude of the epicenter were included and one in which they were excluded.
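A standardize, PCA, then t-SNE pipeline of this kind might look like the sketch below, assuming scikit-learn is used. The abstract does not state the software, the number of principal components retained, or the t-SNE settings, so `embed_2d`, `n_pca_components`, `perplexity`, and `seed` are hypothetical choices, and the input is random placeholder data rather than the actual catalog.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler


def embed_2d(features, n_pca_components=10, perplexity=30.0, seed=0):
    """Map an (n_events, n_features) array to 2D coordinates.

    Hypothetical helper; all parameter values are assumptions.
    """
    # Standardize each column to zero mean and unit variance.
    scaled = StandardScaler().fit_transform(features)
    # Linear reduction with PCA before the nonlinear embedding.
    reduced = PCA(n_components=n_pca_components, random_state=seed).fit_transform(scaled)
    # t-SNE produces the final 2D coordinates.
    tsne = TSNE(n_components=2, perplexity=perplexity, init="pca", random_state=seed)
    return tsne.fit_transform(reduced)


# Placeholder data: 300 synthetic events with 15 features each.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 15))
xy = embed_2d(X)
print(xy.shape)
```

To run the variant without epicenter location, the latitude and longitude columns would be dropped from `X` before calling `embed_2d`, leaving 13 features.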
3. Results
The results obtained when latitude and longitude are excluded from the training data are shown in Fig. 1 (2D-map). The source mechanisms of the earthquakes corresponding to each zone of the 2D-map are shown in Fig. 2. In the lower part of the 2D-map, inter-plate earthquakes along the Chishima Trench and the Japan Trench are clustered in zone C. Inland crustal strike-slip earthquakes and relatively shallow subduction-zone strike-slip earthquakes are distributed around the left and right sides of the 2D-map. Zones A and H both include many inland crustal strike-slip earthquakes, but they are located roughly symmetrically, far from the vertical axis of the 2D-map. This seems to be due to the difference in the strike of the conjugate fault planes. Inland crustal earthquakes, mainly of the reverse-fault type, are distributed in zone G, between zone C (inter-plate reverse-fault earthquakes) and zone H (inland crustal strike-slip earthquakes). Many subduction-zone normal-fault earthquakes are distributed in the upper part of the 2D-map, with earthquakes of north-south strike slightly to the right and those of east-west strike slightly to the left. Zone B also contains earthquakes that may have occurred in the lower part of the double seismic zone inside the subducting Pacific Plate, as well as outer-rise earthquakes. Zone E lies far to the right of the other zones. The earthquakes in zone E are all deep earthquakes (more than several hundred km deep) and are considered to have been detected as anomalies. Zone F, which is located in the upper left of the 2D-map, seems to be composed mainly of extremely low-angle reverse-fault earthquakes.
We show that unsupervised learning of source parameters can be used to classify earthquakes into distinctive groups that reflect not only the earthquake type and the fault type, but also the plate tectonics around Japan. In the future, we will try to improve the explanation of our results by feature engineering, adding features, reducing features, and clustering analysis.