*Mayu Tsuchiya1, Hiroyuki Nagahama2, Jun Muto2, Mitsuhiro Hirano2, Yumi Yasuoka3
(1. Department of Geoenvironmental Science, Faculty of Science, Tohoku University, 2. Department of Earth Science, Tohoku University, 3. Radiation Control Room, Faculty of Pharmaceutical Sciences, Kobe Pharmaceutical University)
Keywords: Machine learning, Random forest analysis, Radon, The 2011 off the Pacific coast of Tohoku Earthquake, The Great Hanshin Earthquake
Many studies have sought to detect anomalies in geochemical signals. Among these signals, radon, a radioactive element, has been used to search for earthquake precursors. It has been pointed out that the behavior of radon may change before an earthquake, and several previous studies have reported that atmospheric radon concentrations fluctuate before earthquakes. In particular, the atmospheric radon concentrations observed at the radioisotope facilities of Fukushima Medical University (FMU) before the 2011 Tohoku-oki Earthquake and of Kobe Pharmaceutical University (KPU) before the 1995 Kobe Earthquake were found to increase. Iwata et al. (2018) detected earthquake-related anomalies in atmospheric radon concentrations by singular spectrum transformation, a non-parametric method for estimating change points in time series. However, the results of this method depend on how its parameters are chosen, and their validity could not be evaluated. Therefore, in this study, Random Forest analysis was conducted to make the detection of anomalies in atmospheric radon concentrations reported in previous studies more objective. Random Forest analysis is a machine learning method that builds a model from an ensemble of decision trees, each trained on randomly selected samples, and its predictions can be obtained easily using a computer. In the analysis, atmospheric radon concentrations from a period with no large earthquakes in the study area were used as the training (teacher) data, and those from the subsequent period as the test data. The atmospheric radon concentrations predicted by the model built from the training data were compared with the observed values. The coefficient of determination, calculated from the differences between observed and predicted values, was used to evaluate the prediction capability. Atmospheric radon concentrations in 2002-2007 and 1984-1989 were set as the training data for FMU and KPU, respectively.
Those in 2008-2011 and 1990-1995 were set as the test data for FMU and KPU, respectively. The explanatory variable was the date on which the atmospheric radon concentration was observed. A prediction model was then created from a randomly selected 70% of the training data, and the remaining 30% was used to obtain predictions. The predicted values were then compared with the observed data using the coefficient of determination. For FMU, the coefficients of determination during the test period were lower than that of the predictive model, and markedly lower in 2011. This indicates that during the test period, atmospheric radon concentrations deviated significantly from their normal fluctuations. These results suggest that the atmospheric radon concentration after 2010 was anomalous. For KPU, the graph comparing predicted and observed values shows that the observed values greatly exceeded the predicted values from the end of 1994 to the beginning of 1995. At the end of 1994, the difference between observed and predicted values was more than three times the standard deviation of that difference. These results suggest that the atmospheric radon concentration increased at the end of 1994 as a precursor to the Kobe Earthquake and that Random Forest analysis was able to detect this precursor. The two analyses of pre-earthquake atmospheric radon concentrations show that, by comparing predictions with observed values, Random Forest analysis can objectively detect changes in atmospheric radon concentration that precede an earthquake. This method has the potential to be used for earthquake prediction by capturing the phenomena that precede earthquakes.
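The workflow described above (fit on a quiet period with a 70/30 random split, predict the subsequent period, then compare by coefficient of determination and a three-standard-deviation residual threshold) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it assumes scikit-learn, encodes the date as day of year (the abstract does not say how the date was encoded), takes the standard deviation from the held-out 30% of the quiet period (the abstract does not say which period it was computed over), and runs on synthetic data. The names `make_synthetic_radon`, `date_features`, and `fit_and_flag` are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split


def make_synthetic_radon(seed=0):
    """Synthetic daily series (NOT real radon data): seasonal cycle plus noise.

    Six quiet years serve as training data and four later years as test data;
    a constant offset is injected into the last 60 test days as a fake anomaly.
    """
    rng = np.random.default_rng(seed)
    train_days = np.arange(6 * 365)
    test_days = np.arange(6 * 365, 10 * 365)

    def seasonal(days):
        return 10.0 + 3.0 * np.cos(2.0 * np.pi * days / 365.0)

    train_radon = seasonal(train_days) + rng.normal(0.0, 0.5, train_days.size)
    test_radon = seasonal(test_days) + rng.normal(0.0, 0.5, test_days.size)
    test_radon[-60:] += 5.0  # injected anomaly, purely illustrative
    return train_days, train_radon, test_days, test_radon


def date_features(days):
    """Assumed date encoding: day of year as the single explanatory variable."""
    return (np.asarray(days) % 365).reshape(-1, 1)


def fit_and_flag(train_days, train_radon, test_days, test_radon, seed=0):
    """Fit on 70% of the quiet period, validate on 30%, then score the test period."""
    X = date_features(train_days)
    X_fit, X_val, y_fit, y_val = train_test_split(
        X, train_radon, test_size=0.3, random_state=seed
    )
    model = RandomForestRegressor(n_estimators=100, random_state=seed)
    model.fit(X_fit, y_fit)

    # Baseline performance and residual spread on the held-out quiet data.
    val_pred = model.predict(X_val)
    r2_validation = r2_score(y_val, val_pred)
    threshold = 3.0 * np.std(y_val - val_pred)

    # Prediction for the subsequent (test) period and its residuals.
    test_pred = model.predict(date_features(test_days))
    residuals = test_radon - test_pred
    return {
        "r2_validation": r2_validation,
        "r2_test": r2_score(test_radon, test_pred),
        "residuals": residuals,
        "flags": residuals > threshold,  # observed exceeds predicted by > 3 sigma
    }


if __name__ == "__main__":
    data = make_synthetic_radon()
    result = fit_and_flag(*data)
    print("R^2 (held-out quiet data):", round(result["r2_validation"], 3))
    print("R^2 (test period):", round(result["r2_test"], 3))
    print("flagged test days:", int(result["flags"].sum()))
```

A cyclic encoding such as day of year is used here because tree-based models cannot extrapolate beyond the range of a raw, ever-increasing date value; whether the original analysis handled the date this way is not stated.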