2:45 PM - 3:00 PM
[S14-04] Predicting Earthquakes with Hierarchical Neural Network Models
INTRODUCTION
Earthquakes continue to cause numerous deaths and heavy financial losses every year, and it is unlikely that the scientific community will find a way to prevent them in the near future. Forecasting earthquakes is our best bet to prevent the loss of lives, as it enables the evacuation of potentially affected areas in advance. Yet this has proven remarkably difficult to achieve, despite many years of effort by the scientific community. With the sharp rise of machine learning and artificial intelligence over the past decade, it is increasingly important to leverage such technologies to improve earthquake forecasting, and in this study we investigate the use of neural networks (NNs) for this purpose.
BACKGROUND
There have been many attempts to use NNs in this field [1,4,3], but with a tendency to repeatedly add additional layers of neurons. Theory says that this makes it increasingly difficult for NNs to learn, due to the increase in their Vapnik–Chervonenkis (VC) dimension [7]. We thus argue that a careful selection of features and a parsimonious choice of NN architecture can yield more reliable results, an easier interpretation of the importance of each feature, and an overall better understanding of the forecasting model obtained after training the network.
METHODS
In this study, we focus on predicting the maximum earthquake magnitude on each day, following recent work [5,6]. We then report the results of our prediction framework, which is founded on two main novel ideas (Fig. 1):
1. calculation of seismicity indicator features over multiple
lengths of time-windows; and
2. separating the first layer of the NN into subnetworks, each
dedicated to processing one particular time-window length.
The seismicity indicators that we calculate over a time-window
of length N days are:
- T-value: the time spanned by all earthquakes above a certain
threshold.
- Mean magnitude.
- Rate of the square root of the seismic energy released.
- Time since the first earthquake above a certain magnitude threshold k.
- Slope b and intercept a of the Gutenberg–Richter (GR) law
fitted to the events in the time-window.
- Sum of squared errors of the magnitudes relative to the ideal
line of the GR law.
- Magnitude deficit: the difference between the expected maximum magnitude and the observed maximum magnitude.
- Coefficient of variation of inter-event times after removing
earthquakes with magnitude below a certain threshold k.
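A subset of these indicators can be sketched as follows. This is not the authors' code: the catalog representation, the threshold default, and the use of the common empirical relation log10 E = 11.8 + 1.5 M for seismic energy are illustrative assumptions.

```python
import numpy as np

def seismicity_indicators(times, mags, k=3.5):
    """Compute some of the listed indicators for one time window.

    times -- event times in days, sorted ascending
    mags  -- corresponding event magnitudes
    k     -- magnitude threshold (assumed value)
    """
    above = mags >= k
    # T-value: time spanned by all earthquakes above the threshold
    t_value = times[above].max() - times[above].min()
    # Mean magnitude over the window
    mean_mag = mags.mean()
    # Rate of the square root of released seismic energy,
    # using the empirical relation log10 E = 11.8 + 1.5 M (ergs)
    window_len = times.max() - times.min()
    de_rate = np.sqrt(10.0 ** (11.8 + 1.5 * mags)).sum() / window_len
    # Gutenberg-Richter fit log10 N(>=M) = a - b*M by least squares
    m_grid = np.linspace(mags.min(), mags.max(), 20)
    log_n = np.log10([np.sum(mags >= m) for m in m_grid])
    slope, a = np.polyfit(m_grid, log_n, 1)
    b = -slope  # GR slope is reported as a positive b-value
    # Sum of squared errors to the fitted GR line
    sse = np.sum((log_n - (a - b * m_grid)) ** 2)
    # Coefficient of variation of inter-event times above threshold
    dt = np.diff(times[above])
    cv = dt.std() / dt.mean()
    return {"T": t_value, "mean_mag": mean_mag, "dE_rate": de_rate,
            "a": a, "b": b, "sse": sse, "cv": cv}
```

The remaining indicators (time since the first event above the threshold, magnitude deficit) follow the same pattern of simple array operations on the windowed catalog.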
To make a prediction, we calculate the seismicity indicators
above for time-windows of size 7, 15, 30, 60, 90, and 180 days
(see Fig. 1). A naive approach would feed them all to a fully
connected multi-layer NN. This forces the network to look at all
the features simultaneously, whereas we believe it is more fruitful to let it look at the features of each time window separately.
In our hierarchical model, features of the time-window of
length N are fed to their own exclusive NN, and only then the
output of these subnetworks is fed to a fully-connected layer of
simple neuron units, and a final layer yields a prediction for the
maximum earthquake magnitude on the following day.
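The forward pass of such a hierarchical model can be sketched in NumPy as follows. The layer sizes, the tanh activation, and the number of indicators per window are illustrative assumptions, not the authors' actual configuration.

```python
import numpy as np

WINDOWS = [7, 15, 30, 60, 90, 180]  # time-window lengths in days
N_FEATURES = 9                      # indicators computed per window
HIDDEN = 4                          # units per subnetwork (assumed)

rng = np.random.default_rng(42)

def init_params():
    """Random initial weights for each subnetwork and the shared layers."""
    params = {"sub": []}
    for _ in WINDOWS:
        W = rng.normal(scale=0.1, size=(N_FEATURES, HIDDEN))
        params["sub"].append((W, np.zeros(HIDDEN)))
    params["W_fc"] = rng.normal(scale=0.1, size=(len(WINDOWS) * HIDDEN, 8))
    params["b_fc"] = np.zeros(8)
    params["W_out"] = rng.normal(scale=0.1, size=(8, 1))
    params["b_out"] = np.zeros(1)
    return params

def forward(params, x_by_window):
    """x_by_window: list of (batch, N_FEATURES) arrays, one per window length."""
    # Each window's features pass through its own exclusive subnetwork
    subs = [np.tanh(x @ W + b)
            for x, (W, b) in zip(x_by_window, params["sub"])]
    h = np.concatenate(subs, axis=1)                    # merge subnetwork outputs
    h = np.tanh(h @ params["W_fc"] + params["b_fc"])    # fully-connected layer
    return h @ params["W_out"] + params["b_out"]        # next-day magnitude

params = init_params()
batch = [rng.normal(size=(5, N_FEATURES)) for _ in WINDOWS]
pred = forward(params, batch)  # shape (5, 1): one predicted magnitude per sample
```

The key design point is that the weight matrices of the subnetworks are disjoint, so each time-window length is processed independently before the shared layers combine the results.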
RESULTS
We applied the proposed model to earthquake catalogs of
New Zealand, Japan (Fig. 1), and the Balkans, and found an average
improvement in forecasting accuracy of 17.1% compared to the
naive NN and of 24.5% compared to radial basis functions.
Our results are comparable to those in the literature,
while using a NN model with fewer layers and at least one
order of magnitude fewer trainable parameters, which makes our
model theoretically more reliable. Since our model is still in its early
stages, many points can yet be improved, which makes these results all the more promising.
REFERENCES
[1] G. A. Cortés, F. M. Álvarez, A. M. Esteban, and J. Reyes. Knowledge-Based Systems, 2016.
[2] I. Goodfellow, Y. Bengio, and A. Courville. Deep Learning. MIT Press, 2016.
[3] A. Panakkat and H. Adeli. International Journal of Neural Systems, 2007.
[4] J. Reyes, A. M. Esteban, and F. M. Álvarez. Applied Soft Computing, 2013.
[5] M. H. J. Saldanha and Y. Hirata. Chaos: An Interdisciplinary Journal of Nonlinear Science, 2022.
[6] M. H. J. Saldanha and Y. Hirata. Journal of Physics: Complexity, 2024.
[7] V. N. Vapnik. Statistical Learning Theory. John Wiley & Sons, 1998.