1:45 PM - 2:00 PM
[AAS05-01] Earth system model replicability - Statistical validation of a model's climate under a change of computing environment
Keywords: Climate model evaluation, climate model replicability, climate model validation
How do we ensure that a model’s climate is replicable? Because the models are non-linear, even a change in compiler flags has been shown to produce significantly different trajectories. If replicability does not hold, differences between contrasting projection scenarios performed on different clusters cannot be interpreted exclusively in terms of the changes in external forcing. Building on state-of-the-art replicability verification methods, we developed a methodology to evaluate the statistical power and sensitivity of current replicability methods and made improvements based on our results. One of our findings is that the power of the current methods is poor when the effective differences are subtle. For instance, a prominent method (Massonnet et al., 2020, https://doi.org/10.5194/gmd-13-1165-2020) reaches the standard threshold of 80% statistical power only if the ensemble means of the evaluation metric are more than two standard deviations apart. However, we observed that the differences in biomass burning emission (BMB) forcing in the CESM2 Large Ensemble Community Project (LENS2), which change the model’s climate, show effective differences of only about 0.5 standard deviations.
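To illustrate how statistical power depends on the separation between two ensembles, the following is a minimal Monte Carlo sketch. It is not the methodology of this work, nor the test used by Massonnet et al. (2020). It assumes a synthetic, Gaussian-distributed evaluation metric, a hypothetical ensemble size of five members per environment, and an exact permutation test on the difference in ensemble means as a stand-in for a replicability test. The function names and default values are illustrative only.

```python
import itertools
import random


def permutation_pvalue(a, b):
    """Exact two-sided permutation p-value for a difference in ensemble means."""
    pooled = list(a) + list(b)
    n_a, n_b = len(a), len(b)
    total = sum(pooled)
    observed = abs(sum(a) / n_a - sum(b) / n_b)
    hits = 0
    count = 0
    # Enumerate every way of labelling n_a of the pooled members as "ensemble A".
    for idx in itertools.combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        # Small tolerance so the observed split always counts as a hit.
        if diff >= observed - 1e-12:
            hits += 1
        count += 1
    return hits / count


def estimate_power(effect_size, n_members=5, n_trials=300, alpha=0.05, seed=0):
    """Fraction of synthetic experiments in which the test rejects equality.

    effect_size is the true shift between the two ensemble means, in units of
    the (assumed common) standard deviation of the metric. n_members=5 is a
    hypothetical ensemble size, not a value taken from the abstract.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_trials):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_members)]
        shifted = [rng.gauss(effect_size, 1.0) for _ in range(n_members)]
        if permutation_pvalue(control, shifted) < alpha:
            rejections += 1
    return rejections / n_trials


if __name__ == "__main__":
    for d in (0.0, 0.5, 1.0, 2.0):
        print(f"effect size {d:.1f} sd -> estimated power {estimate_power(d):.2f}")
```

Under these assumptions, the estimated power at a shift of zero approximates the false-positive rate, and the estimate increases with the shift. The exact numbers depend on the assumed ensemble size, the test, and the distribution of the metric, so they should not be read as a reproduction of the figures quoted above.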
Our new methodology provides (1) new metrics that resolve the differences in the data better, thereby increasing the effect size, and (2) new statistical tests that detect smaller effect sizes. We will present our new methodology applied to the LENS2 ensemble, focusing on the period affected by the different BMB forcing, and to different perturbation schemes of the control ensemble initialized in 1850. Additionally, we will present a recent study that we performed on IFS-NEMO simulations, using double-, single-, and mixed-precision implementations of NEMO.