*Ying Zhang1, Qinghua Huang1
(1. Department of Geophysics, School of Earth and Space Sciences, Peking University)
Keywords: Earthquake prediction, Artificial Neural Networks, Evaluation metrics, Reference model
Artificial Neural Networks (ANNs) have been widely used to predict the time, location, and magnitude of future earthquakes. Evaluation metrics are the tools used to quantitatively measure the performance of ANNs. In the training stage, evaluation metrics are used to optimize the classification algorithm; in the testing stage, they measure the effectiveness of the ANN on unseen data. It is important to recognize that evaluation during training is different from evaluation of the final model. Moreover, the two kinds of evaluation metrics have reference models that play different roles. The reference model of an evaluation metric in the testing stage is the baseline, or benchmark, against which the tested model is compared. Some off-the-shelf machine learning evaluation metrics, such as the receiver operating characteristic (ROC) curve and the Precision-Recall (PRC) plot, are used to evaluate the performance of earthquake prediction models. However, some researchers overlook the fact that the reference model of the ROC curve and the PRC plot is a spatially invariant Poisson distribution, and they forget this null hypothesis when analyzing the results. A positive result from the ROC curve, the PRC plot, or any other evaluation metric that takes a spatially invariant Poisson distribution as its reference model is insufficient to prove that the tested model is effective at predicting earthquakes in both space and time. The evaluation metrics in the training stage are also known as loss functions. The reference model of a loss function controls how much each sample contributes to the gradient of the ANN during training. In this work, the penalties for positive and negative samples are further weighted by the prior probability provided by a statistical seismology model, thereby introducing knowledge of statistical seismology into machine learning techniques.
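The reference-weighted penalty described above can be sketched for the plain Cross Entropy case as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact weighting, so the sketch assumes that a positive sample is weighted by (1 - p_ref) and a negative sample by p_ref, where p_ref is the prior probability from the reference model. The function name `reference_weighted_bce` and the argument names are hypothetical.

```python
import numpy as np


def reference_weighted_bce(y_true, p_model, p_ref, eps=1e-12):
    """Binary cross entropy with per-sample weights from a reference model.

    Assumed weighting (not stated in the abstract): a positive sample gets
    weight (1 - p_ref) and a negative sample gets weight p_ref, so samples
    that the reference model gets wrong contribute more to the loss.
    """
    y_true = np.asarray(y_true, dtype=float)
    # Clip predictions to avoid log(0).
    p_model = np.clip(np.asarray(p_model, dtype=float), eps, 1.0 - eps)
    p_ref = np.asarray(p_ref, dtype=float)

    # Penalty for missed earthquakes, larger where the reference prior is low.
    pos_term = -(1.0 - p_ref) * y_true * np.log(p_model)
    # Penalty for false alarms, larger where the reference prior is high.
    neg_term = -p_ref * (1.0 - y_true) * np.log(1.0 - p_model)
    return float(np.mean(pos_term + neg_term))


# Same model output, but the event is "hard" (low prior) in the first case.
hard = reference_weighted_bce([1], [0.3], [0.1])
easy = reference_weighted_bce([1], [0.3], [0.9])
```

Under this assumed weighting, `hard` is larger than `easy`: an earthquake that the reference model considered unlikely is penalized more heavily when the network also misses it.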
With this weighting, the ANN is designed to focus more on the examples that are hard for the introduced reference model. We choose a smoothed seismicity model as the reference model to revise several traditional loss functions, including Cross Entropy, Balanced Cross Entropy, Focal Loss, and Focal Loss alpha, and we take the estimated cumulative earthquake energy in each time-space unit (1°×1°×10 days) as the input of a Long Short-Term Memory (LSTM) network to predict earthquakes with M ≥ 5.0 across the whole of mainland China. In the testing stage, these models are evaluated with a time-space version of the Molchan diagram. The results show that our new and simple revised loss functions can significantly improve the power of the ANN-based earthquake prediction model. Designing a more complex structure for the ANN and its neurons is not the only way to improve performance: adapting the loss function, or objective function, to better match the actual application problem can also significantly improve model performance.
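The testing-stage evaluation can be illustrated with a basic Molchan curve, which plots the miss rate against the fraction of time-space occupied by alarms. The sketch below is an assumption-laden illustration rather than the authors' procedure: the abstract does not define its time-space version, so the optional `weights` argument (hypothetical) stands in for a per-cell measure such as a reference-model rate, and uniform weights are used by default. The function name `molchan_curve` is also hypothetical.

```python
import numpy as np


def molchan_curve(scores, events, weights=None):
    """Return (tau, nu) points of a Molchan diagram.

    scores  : predicted alarm score for each time-space cell
    events  : 1 if a target earthquake occurred in the cell, else 0
    weights : assumed per-cell measure for the alarm fraction
              (uniform if None); must be positive overall
    """
    scores = np.asarray(scores, dtype=float).ravel()
    events = np.asarray(events, dtype=float).ravel()
    if weights is None:
        weights = np.ones_like(scores)
    weights = np.asarray(weights, dtype=float).ravel()

    # Switch alarms on cell by cell, from the highest score downwards.
    order = np.argsort(-scores, kind="stable")
    # tau: fraction of (weighted) time-space covered by alarms.
    tau = np.concatenate(([0.0], np.cumsum(weights[order]) / weights.sum()))
    # nu: fraction of target earthquakes that fall outside the alarms.
    nu = np.concatenate(([1.0], 1.0 - np.cumsum(events[order]) / events.sum()))
    return tau, nu


tau, nu = molchan_curve(scores=[0.9, 0.1, 0.5, 0.2], events=[1, 0, 1, 0])
```

A curve that falls well below the diagonal from (0, 1) to (1, 0) indicates that the model outperforms the reference measure used for `tau`; with uniform weights that reference is a spatially invariant one, which is why the choice of weights matters.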