Earth system model replicability - Statistical validation of a model's climate under a change of computing environment

Kai Keller; Marta Alerany; Leo Arriola; Mario Acosta

1:45 PM - 2:00 PM

[AAS05-01] Earth system model replicability - Statistical validation of a model's climate under a change of computing environment

*Kai Keller¹, Marta Alerany¹, Leo Arriola¹, Mario Acosta¹ (1. Barcelona Supercomputing Center)

Keywords: Climate model evaluation, climate model replicability, climate model validation

The Sixth Assessment Report (AR6) issued by the Intergovernmental Panel on Climate Change (IPCC) projects that, in a climate 1.5 °C warmer than the pre-industrial period, 1-in-50-year heat waves become about 8 times more frequent and 1-in-10-year extreme precipitation events become twice as frequent. We need to prepare for and adapt to those changes in our global climate. The current best way to do this is to use Earth System Models (ESMs) to project the next 50 to 100 years of our climate under scenarios based on greenhouse gas (GHG) emissions and anthropogenic aerosols. The most prominent initiative dedicated to this aim is the Coupled Model Intercomparison Project (CMIP). Reproducibility is important in this collaborative effort: simulations must be traceable to specific configurations, model versions, and compilation flags, so that the same simulation can be run again in the same environment with identical results. Equally important is replicability, that is, achieving “identical” results when running the same experiment configuration on different clusters, in different computing environments, or with different compilers. Achieving replicable results is much more difficult, and in practice, bit-to-bit replicability can almost never be achieved. However, results can be replicable in the sense that the model's climate in one computing environment is statistically indistinguishable from the climate obtained in another environment. Because of the large number of simulations conducted in CMIP, they are usually distributed across different clusters.

How do we ensure the model’s climate is replicable? It has been demonstrated that the non-linearity of the models leads to significantly different trajectories, even when only the compiler flags differ. If replicability does not hold, differences between contrasting projection scenarios run on different clusters cannot be interpreted exclusively in terms of the changes in external forcing. Building on state-of-the-art replicability verification methods, we developed a methodology to evaluate the statistical power and sensitivity of current replicability methods, and we improved those methods based on our results. One of our findings is that the power of the current methods is poor when the effective differences are subtle. For instance, a prominent method (Massonnet et al., 2020, https://doi.org/10.5194/gmd-13-1165-2020) reaches the standard threshold of 80% statistical power only if the ensemble means of the evaluation metric are more than two standard deviations apart. However, the differences in biomass burning emission (BMB) forcing that we observed in the CESM2 Large Ensemble Community Project (LENS2), which change the model’s climate, correspond to effective differences of only about 0.5 standard deviations.
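To illustrate why small ensembles struggle to detect subtle differences, the relation between effect size and statistical power can be sketched with a textbook calculation. The sketch below is not the method of Massonnet et al. (2020) nor the methodology presented here; it assumes a two-sided two-sample z-test on Gaussian data with a known, common standard deviation, a significance level of 0.05, and a hypothetical ensemble size of 5 members per environment. The function names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(effect_size, n_members, alpha=0.05):
    """Power of a two-sided two-sample z-test with equal ensemble sizes.

    effect_size: distance between the two ensemble means, in units of the
                 common standard deviation (assumed known).
    n_members:   number of members in each ensemble (hypothetical value).
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # The difference of two ensemble means has standard error sqrt(2 / n),
    # so the standardized shift of the test statistic is d * sqrt(n / 2).
    shift = effect_size * sqrt(n_members / 2)
    # Probability that the test statistic falls in either rejection tail.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


def min_members_for_power(effect_size, target=0.8, alpha=0.05, max_members=10_000):
    """Smallest ensemble size per environment that reaches the target power."""
    for n in range(2, max_members + 1):
        if z_test_power(effect_size, n, alpha) >= target:
            return n
    return None  # target not reachable within max_members


if __name__ == "__main__":
    for d in (0.5, 1.0, 2.0):
        print(f"effect size {d}: power with 5 members = {z_test_power(d, 5):.2f}, "
              f"members needed for 80% power = {min_members_for_power(d)}")
```

Under these simplifying assumptions, five members per environment exceed 80% power at an effect size of two standard deviations, while at 0.5 standard deviations the power is only slightly above 10% and several dozen members per environment would be needed. The exact numbers depend on the test and the metric, which is why the actual power of a replicability method has to be evaluated for that method itself.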

Our new methodology provides (1) new metrics that resolve the differences in the data better, thereby increasing the effect size, and (2) new statistical tests that trigger at smaller effect sizes. We will present our new methodology by evaluating the LENS2 ensemble over the period affected by the different BMB forcing, and by analyzing different perturbation schemes of the control ensemble initialized in 1850. Additionally, we will present a recent study that we performed on IFS-NEMO simulations using double-, single-, and mixed-precision NEMO implementations.

Presentation information

[A-AS05] Weather, Climate, and Environmental Science Studies using High-Performance Computing
