# [MGI39-03] Quantitative Logging Unit Classification with Hidden Markov Model

Keywords: Logging, Hidden Markov Model, Clustering

Logging data acquired during ocean drilling projects are usually divided into log units by logging scientists. These unit classifications and their geological interpretations help in understanding the geological formations at the drilling site. However, such classifications are somewhat subjective because they are not based on statistical modelling but rely on manual inspection of the average, minimum, and maximum values of the logging data. As a result, where to place unit boundaries and how many clusters to use in total are persistent problems. In this study, we develop statistical methods that classify logging data into log units and address these problems.

This study uses a Hidden Markov Model (HMM) to classify logging data into log units. The hidden states correspond to the log units to be estimated. We assume Gaussian distributions for the emission probabilities of the observed data, so that a mean vector and a covariance matrix characterize each log unit. We estimate these parameters with the Expectation-Maximization (EM) algorithm, using the K-means++ method to select initial values for the mean vectors. We determine the total number of clusters from the evidence values estimated through the EM algorithm.
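The K-means++ seeding step mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `sq_dist` and `kmeans_pp_init`, the `seed` parameter, and the list-of-lists data layout are assumptions made for illustration. Each new initial mean is drawn from the samples with probability proportional to its squared distance from the nearest mean already chosen.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_init(samples, k, seed=0):
    """Pick k initial mean vectors from samples using K-means++ seeding.

    samples: list of equal-length numeric vectors (hypothetical layout,
             e.g. one vector of log measurements per depth).
    """
    rng = random.Random(seed)
    # The first mean is drawn uniformly at random.
    centers = [list(rng.choice(samples))]
    # d2[i] = squared distance from sample i to its nearest chosen mean.
    d2 = [sq_dist(s, centers[0]) for s in samples]
    while len(centers) < k:
        if sum(d2) == 0.0:
            # Every sample coincides with a chosen mean: fall back to uniform.
            idx = rng.randrange(len(samples))
        else:
            # Draw an index with probability proportional to d2.
            idx = rng.choices(range(len(samples)), weights=d2, k=1)[0]
        centers.append(list(samples[idx]))
        # Refresh nearest-mean distances with the newly added mean.
        d2 = [min(old, sq_dist(s, centers[-1])) for old, s in zip(d2, samples)]
    return centers
```

The returned vectors would then serve as the starting mean vectors for EM. How the initial covariance matrices and transition probabilities are chosen is not stated in the text and is left out of this sketch.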

We applied the HMM to several drilling sites around Japan. The total number of clusters is usually larger than the number of log units determined by onboard logging scientists. For easier geological interpretation, we apply hierarchical clustering to the estimated clusters, with the distance between Gaussian distributions defined by the Earth Mover’s distance.
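As a rough illustration of this grouping step, the sketch below merges Gaussian clusters by average linkage. Several choices here are assumptions not given in the text: the Earth Mover's distance is taken with a quadratic cost (the 2-Wasserstein distance), which has a closed form between Gaussians, W² = ‖m₁ − m₂‖² + tr(S₁ + S₂ − 2(S₂^{1/2} S₁ S₂^{1/2})^{1/2}); the covariances are restricted to diagonal form so that the trace term reduces to a sum over variances; and the linkage rule and the names `w2_diag` and `agglomerate` are hypothetical.

```python
import math


def w2_diag(mean_a, var_a, mean_b, var_b):
    """2-Wasserstein distance between two Gaussians with diagonal covariances.

    var_a and var_b hold the per-dimension variances (assumed diagonal case).
    """
    mean_term = sum((x - y) ** 2 for x, y in zip(mean_a, mean_b))
    cov_term = sum((math.sqrt(u) - math.sqrt(v)) ** 2
                   for u, v in zip(var_a, var_b))
    return math.sqrt(mean_term + cov_term)


def agglomerate(clusters, n_groups):
    """Average-linkage merging of Gaussian clusters until n_groups remain.

    clusters: list of (mean, variances) tuples, one per estimated cluster.
    Returns a list of groups, each a sorted list of cluster indices.
    """
    n = len(clusters)
    # Pairwise distances between the original clusters.
    dist = [[w2_diag(*clusters[i], *clusters[j]) for j in range(n)]
            for i in range(n)]
    groups = [[i] for i in range(n)]
    while len(groups) > max(n_groups, 1):
        best = None
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                # Average pairwise distance between members of the two groups.
                d = sum(dist[i][j] for i in groups[a] for j in groups[b])
                d /= len(groups[a]) * len(groups[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        # Merge the closest pair of groups.
        groups[a] = sorted(groups[a] + groups[b])
        del groups[b]
    return groups
```

For example, four one-dimensional clusters with means 0.0, 0.1, 5.0, and 5.2 would be merged into two groups, pairing the two low-mean clusters and the two high-mean clusters. With full covariance matrices the trace term requires a matrix square root, which this simplified sketch does not cover.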
