JSAI2025

Presentation information

General Session

[3L1-GS-10] AI application

Thu. May 29, 2025 9:00 AM - 10:40 AM Room L (Room 1007)

Chair: Yuta Nambu (南部 優太), Human Informatics Laboratories, Nippon Telegraph and Telephone Corporation

10:20 AM - 10:40 AM

[3L1-GS-10-05] Proposal of a Robust Multimodal Deep Learning Model for Domain Shift in Hysteroscopic Images

〇Utoku Kakiyama1, Kazunari Henmi1, Kouhei Miyata2, Yosihito Inoue3, Motoki Nabeta4, Fusanori Yotsumoto2, Chihiro Sibata1 (1. Hosei University, 2. Fukuoka University, 3. Inoue Zen Ladies Clinic, 4. Tsubaki Women's Clinic)

Keywords: multimodal model, domain shift, interpretability, medical

Recent advances in deep learning have enabled a range of medical applications, including diagnostic support based on hysteroscopic images. Although high classification accuracy has been achieved, the robustness of these models to domain shift remains uncertain, which poses a challenge for clinical implementation. This study focuses on chronic endometritis, a persistent inflammation of the endometrium. We propose CLIP-MLP, a method that first predicts lesion areas in hysteroscopic images using a deep learning model; a multimodal model then classifies the condition by integrating the original image with explanatory text generated from those predictions. Experimental results demonstrate that CLIP-MLP outperforms image-only models in classifying unseen datasets, improving generalization and robustness to domain shift. This approach enhances the reliability of deep learning-based hysteroscopic diagnosis and facilitates its clinical adoption.
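
The two-stage flow described in the abstract (lesion prediction, then classification from the image plus generated text) can be illustrated with a minimal sketch. This is not the authors' implementation: the finding names, the sentence template, the fusion by concatenation, and the helper names `describe_findings`, `init_params`, and `classify` are all assumptions made for illustration, and the short lists of floats stand in for the image and text embeddings that a CLIP-style encoder would produce.

```python
import math
import random


def describe_findings(scores, threshold=0.5):
    """Turn per-finding scores into an explanatory sentence.

    The finding names and the sentence template are hypothetical;
    the abstract does not specify either.
    """
    present = [name for name, s in sorted(scores.items()) if s >= threshold]
    if not present:
        return "No lesion areas were predicted in this hysteroscopic image."
    return "Predicted lesion areas: " + ", ".join(present) + "."


def init_params(dim_in, hidden, seed=0):
    """Small random weights for a one-hidden-layer MLP (untrained)."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.1, 0.1) for _ in range(dim_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.1, 0.1) for _ in range(hidden)]
    b2 = 0.0
    return w1, b1, w2, b2


def mlp_forward(x, w1, b1, w2, b2):
    """One hidden layer with ReLU, followed by a sigmoid output."""
    hidden = [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w1, b1)
    ]
    logit = sum(w * h for w, h in zip(w2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-logit))


def classify(image_emb, text_emb, params):
    """Fuse the two embeddings by concatenation (an assumption) and score them."""
    fused = list(image_emb) + list(text_emb)
    return mlp_forward(fused, *params)


# Stage 1 output (hypothetical scores from a lesion-prediction model).
text = describe_findings({"hyperemia": 0.8, "micropolyp": 0.2})

# Stage 2: placeholder embeddings stand in for encoder outputs.
image_emb = [0.1, 0.2]
text_emb = [0.3, 0.4]
probability = classify(image_emb, text_emb, init_params(4, 3))
```

In a real system the placeholder lists would be replaced by embeddings of the image and of `text`, and the MLP weights would be trained on labeled cases; the sketch only shows how the pieces connect.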
