Japan Geoscience Union Meeting 2025

Presentation information

[E] Poster

M (Multidisciplinary and Interdisciplinary) » M-IS Intersection

[M-IS06] Evolution and variability of the Tropical Monsoon and Indo-Pacific climate during the Cenozoic Era

Thu. May 29, 2025 5:15 PM - 7:15 PM Poster Hall (Exhibition Hall 7&8, Makuhari Messe)

Conveners: Kenji Matsuzaki (Atmosphere and Ocean Research Institute, The University of Tokyo), Takuya Sagawa (Institute of Science and Engineering, Kanazawa University), Sze Ling Ho (Institute of Oceanography, National Taiwan University), Stephen J Gallagher (University of Melbourne)

[MIS06-P10] Leveraging Big Data and Deep Learning for Quantifying XRF Core Scanning Data into Various Geological Proxies

*An-Sheng Lee1, Yu-Wen Pao2, Hsuan-Tien Lin2, Ya Hsuan Sofia Liou1 (1. Department of Geosciences and Research Center for Future Earth, National Taiwan University; 2. Department of Computer Science and Information Engineering, National Taiwan University)

Keywords: X-ray Fluorescence (XRF), Geochemistry, Machine Learning, Self-supervised Learning, Foundation Model, Spectral Analysis

X-ray fluorescence (XRF) core scanning is widely used in geological research because it is rapid, non-destructive, and high-resolution. Considerable effort has gone into converting XRF measurements into quantitative geological proxies; however, conventional quantification models remain largely project-specific. Because materials and target proxies vary between studies, these models rarely transfer across projects, so each new project must gather a substantial dataset of its own to train an accurate model.
To address this challenge, we employ self-supervised learning using a masked deep autoencoder architecture on a global collection of XRF data and geological proxies. Our objective is to develop a foundation model that overcomes project-specific limitations and continuously improves by integrating diverse datasets, including legacy cores.
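The masking step at the core of masked-autoencoder pre-training can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the channel count (2048), patch size (16), and mask ratio (0.5) are hypothetical values, the function names are invented for this example, and no neural network is included. The sketch only shows how a spectrum might be split into patches, how a random subset is hidden, and how a reconstruction loss restricted to the hidden patches could be computed.

```python
import numpy as np


def patchify(spectra: np.ndarray, patch_size: int) -> np.ndarray:
    """Reshape (n_spectra, n_channels) into (n_spectra, n_patches, patch_size)."""
    n_spectra, n_channels = spectra.shape
    if n_channels % patch_size != 0:
        raise ValueError("n_channels must be divisible by patch_size")
    return spectra.reshape(n_spectra, n_channels // patch_size, patch_size)


def random_patch_mask(n_spectra: int, n_patches: int, mask_ratio: float,
                      rng: np.random.Generator) -> np.ndarray:
    """Return a boolean mask (True = hidden) with the same number of
    hidden patches in every spectrum."""
    n_masked = int(round(n_patches * mask_ratio))
    # Sorting random scores yields an independent random permutation per row.
    order = np.argsort(rng.random((n_spectra, n_patches)), axis=1)
    mask = np.zeros((n_spectra, n_patches), dtype=bool)
    np.put_along_axis(mask, order[:, :n_masked], True, axis=1)
    return mask


def masked_mse(reconstruction: np.ndarray, target: np.ndarray,
               mask: np.ndarray) -> float:
    """Mean squared error computed over the hidden patches only."""
    per_patch = ((reconstruction - target) ** 2).mean(axis=-1)
    return float(per_patch[mask].mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    spectra = rng.random((4, 2048))            # synthetic stand-in spectra
    patches = patchify(spectra, patch_size=16)
    mask = random_patch_mask(4, patches.shape[1], 0.5, rng)
    # Zero-filled hidden patches stand in for a decoder's output here.
    zero_filled = np.where(mask[..., None], 0.0, patches)
    print(patches.shape, int(mask.sum()), masked_mse(zero_filled, patches, mask))
```

In an actual pre-training loop, an encoder would see only the visible patches and a decoder would predict the hidden ones; the loss above would then be minimized over the unlabelled spectra.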
Our initial results demonstrate the effectiveness of this approach. The foundation model is pre-trained on 54,643 spectra from marine sediments collected in high-latitude regions of the Pacific and Southern Oceans. Pre-training lets the model learn a general representation of XRF spectra and recognize their key spectral features. After fine-tuning with only one-third of the training data, the model outperforms conventional quantification methods in accuracy for calcium carbonate (CaCO3) and total organic carbon (TOC) measurements. Furthermore, it exhibits a 60% improvement in accuracy when tested on entirely unseen sediment cores located tens of kilometers from the training cores, demonstrating its strong generalizability.
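For readers who want to reproduce this kind of comparison, the sketch below shows one way to draw a one-third subset of labelled samples for fine-tuning and to express an accuracy gain as a relative reduction in error. The abstract does not state which error metric underlies the reported improvement, so the use of RMSE here is an assumption, as are all function names; the numbers in the usage lines are toy values, not results from the study.

```python
import numpy as np


def subsample_indices(n_samples: int, fraction: float,
                      rng: np.random.Generator) -> np.ndarray:
    """Pick a random subset of sample indices (without replacement)."""
    n_keep = max(1, int(round(n_samples * fraction)))
    return np.sort(rng.choice(n_samples, size=n_keep, replace=False))


def rmse(y_true, y_pred) -> float:
    """Root-mean-square error between measured and predicted proxy values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def relative_error_reduction(baseline_error: float, model_error: float) -> float:
    """Fraction by which the model's error is lower than the baseline's."""
    return (baseline_error - model_error) / baseline_error


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    idx = subsample_indices(300, 1 / 3, rng)   # indices of the fine-tuning subset
    print(len(idx))
    # Toy error values chosen only to illustrate the arithmetic.
    print(relative_error_reduction(baseline_error=2.0, model_error=1.5))
```

Fixing the random seed, as done here, keeps the subset reproducible, so the fine-tuned model and the conventional baseline can be evaluated on the same held-out cores.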
To scale up this approach, we have added legacy cores from additional oceanic and terrestrial regions, such as the Indian Ocean, the Japan Sea, the Arctic, and European and Patagonian lakes. Our collaboration with Kochi University provides access to a broader range of core samples. By expanding the database to incorporate diverse materials and machine settings, we aim to enhance the model’s adaptability. Ultimately, this approach seeks to extend beyond core scanning and facilitate advancements in all XRF-based measurement techniques.