10:45 AM - 12:15 PM
[HCG22-P04] Architecture of the convolutional neural network suitable for the automatic identification of trace fossils to evaluate bioturbation intensity
Keywords: bioturbation, semantic segmentation, U-Net, core section image analysis, Ichnology
We examined several U-Net-type encoder-decoder models to find convolutional neural network (CNN) architectures suitable for identifying trace fossils in core section images. Once a CNN model has been trained on pairs of core section images and manually segmented images, it can be used to predict the distribution of trace fossils in core images. Although various architectures for semantic image segmentation have been proposed in recent years, there have been few attempts to apply them to trace fossil extraction.
This study used core section images from IODP Expedition 362 Site U1480, where the cores are mainly composed of Miocene to Pleistocene submarine fan deposits. Five core section images from Hole F were manually painted in three colors representing background, trace fossils, and outcrop. The images were trimmed into a total of 2,160 tiles of 224 × 224 pixels, of which 1,434 and 726 tiles were used as the training and validation datasets, respectively.
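The tiling step can be sketched as follows. This is a minimal illustration that assumes non-overlapping tiles and discards incomplete tiles at the image edges; the abstract does not state whether the tiles overlapped or how edges were handled, and the function name `tile_image` is hypothetical.

```python
import numpy as np

def tile_image(image, tile=224):
    """Cut an (H, W, C) array into non-overlapping tile x tile patches.

    Assumption: incomplete tiles at the bottom and right edges are dropped.
    """
    h, w = image.shape[:2]
    rows, cols = h // tile, w // tile
    # Crop to a whole number of tiles in each direction.
    cropped = image[:rows * tile, :cols * tile]
    # Split both spatial axes into (tile index, offset within tile).
    blocks = cropped.reshape(rows, tile, cols, tile, -1).swapaxes(1, 2)
    # Flatten the tile grid into one batch axis, row by row.
    return blocks.reshape(rows * cols, tile, tile, -1)

# Example with a synthetic 448 x 672 RGB image (2 rows x 3 columns of tiles).
image = np.arange(448 * 672 * 3).reshape(448, 672, 3)
tiles = tile_image(image)
```

The same function would be applied to both the core photograph and its manually painted counterpart so that each image tile keeps its matching label tile.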
Four U-Net-type CNNs were examined: the normal U-Net, Deep ResUNet, Attention U-Net, and Attention ResUNet. The U-Net is an encoder-decoder CNN with skip connections between the encoder and decoder networks, which help recover the position of each pixel when the feature maps are upsampled. The Deep ResUNet has residual connections between the input and output of each convolutional block, which mitigate vanishing gradients during backpropagation. The Attention U-Net applies an attention mechanism in the decoder network during upsampling of the feature maps, indicating which regions the network should attend to when classifying pixels. The Attention ResUNet is a hybrid of the Deep ResUNet and the Attention U-Net. For each model, the Tanimoto loss with complements and RAdam were employed as the loss function and the optimizer, respectively. Each model was trained for 150 epochs. Model performance was assessed by the Dice coefficient D at the last epoch and by the RMSE between the bioturbation intensities calculated from the manually painted and the predicted images.
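The evaluation quantities named above can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the implementation used in the study: the Tanimoto loss with complements is written in its commonly used form (one minus the average of the Tanimoto coefficient and the Tanimoto coefficient of the complemented inputs), and bioturbation intensity is assumed here to be the fraction of pixels labeled as trace fossil, which the abstract does not define. All function names and the class index `trace_class=1` are hypothetical.

```python
import numpy as np

EPS = 1e-7  # small constant to avoid division by zero

def dice(pred, true):
    """Dice coefficient D = 2|P.T| / (|P| + |T|) for one class."""
    inter = np.sum(pred * true)
    return (2.0 * inter + EPS) / (np.sum(pred) + np.sum(true) + EPS)

def tanimoto(pred, true):
    """Tanimoto coefficient T = P.T / (|P|^2 + |T|^2 - P.T)."""
    inter = np.sum(pred * true)
    return (inter + EPS) / (np.sum(pred ** 2) + np.sum(true ** 2) - inter + EPS)

def tanimoto_complement_loss(pred, true):
    """One minus the mean of T(P, T) and T(1 - P, 1 - T)."""
    return 1.0 - 0.5 * (tanimoto(pred, true) + tanimoto(1.0 - pred, 1.0 - true))

def bioturbation_intensity(labels, trace_class=1):
    """Assumed definition: fraction of pixels labeled as trace fossil."""
    return float(np.mean(labels == trace_class))

def rmse(a, b):
    """Root-mean-square error between two sequences of intensities."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.sqrt(np.mean((a - b) ** 2)))

# Toy 2 x 2 masks: a perfect prediction and a fully disjoint one.
true_mask = np.array([[1.0, 0.0], [0.0, 1.0]])
perfect = true_mask.copy()
disjoint = 1.0 - true_mask
```

In this form, `rmse` would be evaluated over per-tile (or per-image) intensities computed once from the manually painted labels and once from the predicted labels.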
As a result, the Attention ResUNet showed the best performance among the models. The loss and D at the last epoch for the validation dataset were almost the same across the models, ranging from 0.50 to 0.55 and from 0.86 to 0.87, respectively. In contrast, the RMSE values for the Attention ResUNet and the normal U-Net were 0.0051 and 0.0105, respectively. This result suggests that, in terms of RMSE, the estimation of bioturbation intensity by the Attention ResUNet was about twice as accurate as that by the normal U-Net. The Deep ResUNet and Attention ResUNet may benefit from the scale of the network, because they have more trainable parameters than the other models owing to the residual connections. In addition, the attention mechanism in the Attention ResUNet could reduce noise and misclassification at the complex boundaries between trace fossils and outcrops. In future work, the models are expected to be applied to the submarine fan deposits at Site U1480 and to cores from other wells to investigate general trends in the variation of bioturbation intensity.
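The comparison of the two models rests on the ratio of the two reported RMSE values, which can be checked directly:

```latex
\frac{\mathrm{RMSE}_{\text{U-Net}}}{\mathrm{RMSE}_{\text{Attention ResUNet}}}
  = \frac{0.0105}{0.0051} \approx 2.06
```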