10:20 AM - 10:40 AM
[3L1-GS-10-05] Proposal of a Robust Multimodal Deep Learning Model for Domain Shift in Hysteroscopic Images
Keywords: multimodal model, domain shift, interpretability, medical
Recent advances in deep learning have given rise to a range of medical applications, including diagnostic support from hysteroscopic images. Although high classification accuracy has been achieved, the robustness of these models to domain shift remains uncertain, which hinders clinical implementation. This study focuses on chronic endometritis, a condition of persistent endometrial inflammation. We propose CLIP-MLP, a method that first predicts lesion areas in hysteroscopic images with a deep learning model and then classifies the condition with a multimodal model that integrates the original image with explanatory text generated from those predictions. Experimental results show that CLIP-MLP outperforms image-only models on unseen datasets, improving generalization and robustness to domain shift. This approach enhances the reliability of deep learning-based hysteroscopic diagnosis, facilitating its clinical adoption.
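The fusion step the abstract describes, combining an image embedding with the embedding of generated explanatory text and classifying the pair with an MLP head, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the embedding size, hidden width, class count, and the random stand-in vectors (which substitute for real CLIP encoder outputs) are all assumptions for the sketch.

```python
# Hedged sketch of CLIP-MLP-style fusion: a frozen encoder pair (e.g. CLIP)
# would yield one image embedding and one text embedding per case; an MLP
# head classifies their concatenation. All dimensions below are illustrative
# assumptions, and random vectors stand in for real encoder features.
import numpy as np

rng = np.random.default_rng(0)
EMB_DIM = 512      # typical CLIP embedding size (assumption)
HIDDEN = 128       # MLP hidden width (assumption)
N_CLASSES = 2      # e.g. chronic endometritis vs. normal (assumption)

def mlp_head(fused, w1, b1, w2, b2):
    """Two-layer MLP with ReLU, returning softmax class probabilities."""
    h = np.maximum(fused @ w1 + b1, 0.0)   # hidden layer with ReLU
    logits = h @ w2 + b2
    e = np.exp(logits - logits.max())      # numerically stable softmax
    return e / e.sum()

# Stand-ins for encoder outputs; real features would come from the image
# encoder and from encoding the generated lesion-description text.
image_emb = rng.standard_normal(EMB_DIM)
text_emb = rng.standard_normal(EMB_DIM)
fused = np.concatenate([image_emb, text_emb])

# Randomly initialized head weights (trained weights in the real system).
w1 = rng.standard_normal((2 * EMB_DIM, HIDDEN)) * 0.01
b1 = np.zeros(HIDDEN)
w2 = rng.standard_normal((HIDDEN, N_CLASSES)) * 0.01
b2 = np.zeros(N_CLASSES)

probs = mlp_head(fused, w1, b1, w2, b2)
print(probs.shape)  # one probability per class
```

The key design point conveyed by the abstract is that the classifier sees both modalities at once, so text describing predicted lesion areas can compensate when image appearance shifts between datasets.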