3:00 PM - 3:15 PM

# [SSS14-06] On Statistical Hypothesis Testing of Earthquake Precursory Phenomena

Keywords: Earthquake Precursory Phenomena

A wide variety of phenomena may precede large earthquakes. The physical mechanisms linking these phenomena to earthquakes are often unclear, but if the statistical link between them were strong, the phenomena could be used in practical earthquake forecasting. In this talk, we argue that statistical tests have been misused in some previous studies of possible precursory phenomena. We present examples from electromagnetic studies, but the lessons apply to other observations as well.

We discuss two improper practices in statistical hypothesis testing: "data snooping" and multiple testing. As Love and Thomas [2013] pointed out, statistical testing obliges us to set the hypothesis to be tested before looking at the data. When the hypothesis is set after looking at the data, the test is not valid, and such inference is referred to as data snooping. Unless the hypothesis is tested with another set of data, we should refrain from making conclusive statements about its statistical significance.
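One way to test a hypothesis "with another set of data" is to fix it using only an earlier part of the record and then test it once on the remainder. The sketch below illustrates that idea under stated assumptions: the function name, the chronological split, and the sample values are hypothetical and are not taken from Love and Thomas [2013] or from any of the studies discussed.

```python
def split_exploration_confirmation(event_times, split_time):
    """Split observation times into two non-overlapping sets.

    Hypothetical helper: the exploration set may be inspected freely to
    formulate a hypothesis; the confirmation set is reserved for a single
    test of that hypothesis and must not be examined beforehand.
    """
    exploration = [t for t in event_times if t < split_time]
    confirmation = [t for t in event_times if t >= split_time]
    return exploration, confirmation


# Illustrative values only (e.g. event times in years).
times = [2001.2, 2003.7, 2005.1, 2008.4, 2010.9, 2012.3]
explore, confirm = split_exploration_confirmation(times, 2008.0)
```

A significance statement would then rest only on the test applied to `confirm`, since the hypothesis was chosen without reference to it.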

A Type I error is a so-called "false positive", in which a true null hypothesis is incorrectly rejected. We set the significance level to a small number to reduce this possibility. When we perform a large number of statistical tests concurrently, the expected number of Type I errors increases in proportion to the number of tests. As Love and Thomas [2013] showed, when the physical mechanism linking the phenomena to earthquakes is not known, we are forced to try many sets of parameters and repeat the tests. Proper corrections, such as the Bonferroni correction, reduce the occurrence of Type I errors, but are often not used in previous publications on precursory phenomena. Additionally, when a specific combination of parameters is used without a theoretical basis or strong rationale, it may be safe to assume that multiple testing was used but not properly documented in such publications, so that the possibility of a false positive remains regardless of their conclusions.
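The growth of Type I errors with the number of tests, and the effect of the Bonferroni correction, can be sketched numerically. The code below is a minimal illustration, not the procedure of any cited study: the function names, the significance level of 0.05, and the example p-values are assumptions chosen for the example, and the family-wise error rate formula assumes the tests are independent.

```python
def expected_false_positives(num_tests, alpha):
    """Expected number of Type I errors when every null hypothesis is true."""
    return num_tests * alpha


def familywise_error_rate(num_tests, alpha):
    """Probability of at least one Type I error (assumes independent tests)."""
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_reject(p_values, alpha=0.05):
    """Reject a null hypothesis only if its p-value <= alpha / number of tests."""
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]


# Illustrative p-values from three hypothetical parameter combinations.
p_values = [0.001, 0.02, 0.04]
uncorrected = [p <= 0.05 for p in p_values]   # all three look "significant"
corrected = bonferroni_reject(p_values, 0.05)  # threshold is 0.05 / 3
```

With 100 tests at a significance level of 0.05, `expected_false_positives(100, 0.05)` gives about 5 spurious detections even if no real effect exists. In the three-test example, only the smallest p-value survives the Bonferroni threshold.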
