Japan Geoscience Union Meeting 2025

Presentation information

[J] Poster

M (Multidisciplinary and Interdisciplinary) » M-GI General Geosciences, Information Geosciences & Simulations

[M-GI29] Data-driven geosciences

Mon. May 26, 2025 5:15 PM - 7:15 PM Poster Hall (Exhibition Hall 7&8, Makuhari Messe)

convener: Kenta Ueki (Japan Agency for Marine-Earth Science and Technology), Shin-ichi Ito (The University of Tokyo), Keita Itano (Akita University), Masaoki Uno (Department of Earth and Planetary Science, Graduate School of Science, the University of Tokyo)

5:15 PM - 7:15 PM

[MGI29-P03] Addressing Challenges in Source Rock Type Classification Using Mineral Chemical Composition

*Keita Itano1, Masatsugu Itano1 (1.Akita University)

Keywords: Machine Learning, Compositional data, Missing data, Accessory mineral

The dating of detrital minerals in sediments and sedimentary rocks has been widely applied in various fields such as provenance studies, the reconstruction of orogenic histories, and ore exploration. Since the early 2000s, the widespread use of laser ablation inductively coupled plasma mass spectrometry (LA-ICP-MS) has enabled the acquisition of large datasets of zircon and monazite ages. Interpreting these age data requires constraining the formation processes and source rock types of individual mineral grains.
The major and trace element compositions of minerals are sensitive to their formation mechanisms and environments (e.g., bulk chemical composition and coexisting mineral assemblages), and thus provide valuable information about their formation processes and source rock types. Numerous geochemical indicators have been proposed for classifying the source rocks of detrital zircon. By leveraging extensive datasets of zircon and monazite chemical compositions across various rock types, machine learning approaches offer the potential to develop higher-precision classification models and uncover new patterns within the data.
Several characteristics of mineral chemical data, such as their compositional nature, missing values, and class imbalance, can be obstacles for the classification task. This study evaluated the impact of data imbalance and examined several strategies for addressing it using zircon trace element data. We compared three approaches: (i) adjusting the contribution of each class to the loss function by weighting it according to the class sample size, (ii) oversampling the minority class through synthetic sampling to increase its sample size, and (iii) combining ensemble learning with undersampling to reduce the sample size of the majority class. We confirmed that all approaches were effective; however, the degree of undersampling had a significant impact on classification performance. In particular, when the sample size of the minority class was small (<100), excessive undersampling led to a significant decrease in classification accuracy.
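As an illustration only, approaches (i) and (iii) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the procedure used in the study: the weighting formula n / (k * n_c) is one common inverse-frequency heuristic, and the function names, the `ratio` parameter, and the class labels are hypothetical. Approach (ii), synthetic oversampling, is omitted here.

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Approach (i), assumed form: weight each class by n / (k * n_c),
    where n is the total sample count, k the number of classes, and
    n_c the sample count of class c. Rare classes get larger weights."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * m) for c, m in counts.items()}


def undersampled_subsets(labels, n_subsets, ratio=1.0, seed=0):
    """Approach (iii), assumed form: build index subsets for an ensemble.
    Each subset keeps every sample of the smallest class and a random draw
    of at most ratio * n_min samples from each larger class. A smaller
    ratio means more aggressive undersampling."""
    rng = random.Random(seed)
    by_class = {}
    for i, c in enumerate(labels):
        by_class.setdefault(c, []).append(i)
    n_min = min(len(ix) for ix in by_class.values())
    cap = max(1, int(round(n_min * ratio)))
    subsets = []
    for _ in range(n_subsets):
        idx = []
        for ix in by_class.values():
            # Classes at or below the cap are kept whole.
            idx.extend(ix if len(ix) <= cap else rng.sample(ix, cap))
        subsets.append(sorted(idx))
    return subsets


# Hypothetical labels: 900 grains of a common rock type, 100 of a rare one.
labels = ["majority"] * 900 + ["minority"] * 100
weights = balanced_class_weights(labels)
subsets = undersampled_subsets(labels, n_subsets=5, ratio=1.0)
```

One base classifier would then be trained on each index subset and the predictions combined, for example by majority vote. The `ratio` parameter stands in for the degree of undersampling discussed above: lowering it discards more majority-class samples per subset.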