10:20 AM - 10:40 AM
[3L1-GS-10-05] Proposal of a Robust Multimodal Deep Learning Model for Domain Shift in Hysteroscopic Images
Keywords: multimodal model, domain shift, interpretability, medical
Recent advances in deep learning have given rise to a range of medical applications, including diagnostic support from hysteroscopic images. Although high classification accuracy has been achieved, the robustness of these models to domain shift remains uncertain, which hinders clinical implementation. This study focuses on chronic endometritis, a condition of persistent endometrial inflammation. We propose CLIP-MLP, a method that first predicts lesion areas in hysteroscopic images with a deep learning model and then classifies the condition with a multimodal model that integrates the original image with explanatory text generated from those predictions. Experimental results show that CLIP-MLP outperforms image-only models on unseen datasets, improving generalization and robustness to domain shift. This approach enhances the reliability of deep learning-based hysteroscopic diagnosis, facilitating its clinical adoption.
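The fusion step the abstract describes, combining an image embedding with the embedding of generated explanatory text and classifying the pair with an MLP head, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the embedding size, hidden width, class count, and the random stand-in vectors (which substitute for real CLIP encoder outputs) are all assumptions for the sketch.

```python
# Hedged sketch of CLIP-MLP-style fusion: a frozen encoder pair (e.g. CLIP)
# would yield one image embedding and one text embedding per case; an MLP
# head classifies their concatenation. All dimensions below are illustrative
# assumptions, and random vectors stand in for real encoder features.
import numpy as np

rng = np.random.default_rng(0)
EMB_DIM = 512      # typical CLIP embedding size (assumption)
HIDDEN = 128       # MLP hidden width (assumption)
N_CLASSES = 2      # e.g. chronic endometritis vs. normal (assumption)

def mlp_head(fused, w1, b1, w2, b2):
    """Two-layer MLP with ReLU, returning softmax class probabilities."""
    h = np.maximum(fused @ w1 + b1, 0.0)   # hidden layer with ReLU
    logits = h @ w2 + b2
    e = np.exp(logits - logits.max())      # numerically stable softmax
    return e / e.sum()

# Stand-ins for encoder outputs; real features would come from the image
# encoder and from encoding the generated lesion-description text.
image_emb = rng.standard_normal(EMB_DIM)
text_emb = rng.standard_normal(EMB_DIM)
fused = np.concatenate([image_emb, text_emb])

# Randomly initialized head weights (trained weights in the real system).
w1 = rng.standard_normal((2 * EMB_DIM, HIDDEN)) * 0.01
b1 = np.zeros(HIDDEN)
w2 = rng.standard_normal((HIDDEN, N_CLASSES)) * 0.01
b2 = np.zeros(N_CLASSES)

probs = mlp_head(fused, w1, b1, w2, b2)
print(probs.shape)  # one probability per class
```

The key design point conveyed by the abstract is that the classifier sees both modalities at once, so text describing predicted lesion areas can compensate when image appearance shifts between datasets.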