JSAI2025

Presentation information

Poster Session

[2Win5] Poster session 2

Wed. May 28, 2025 3:30 PM - 5:30 PM Room W (Event hall D-E)

[2Win5-30] A Note on Improving Accuracy in Composed Image Retrieval through Training Data Generation

Introduction of a Counterfactual Image Generation Model with Text Refinement

〇Kenta Uesugi, Naoki Saito, Keisuke Maeda, Takahiro Ogawa, Miki Haseyama (Hokkaido University)

Keywords: Composed Image Retrieval, Counterfactual Image Generation, Data Augmentation

In this paper, we propose a training data generation method for Composed Image Retrieval (CIR) that uses a counterfactual image generation model. CIR is a retrieval method that uses both an image and text as the query, which makes it possible to express nuanced information that is difficult to convey with a single modality, and it is an essential technique for efficient image retrieval. However, training CIR models requires a large amount of triplet data, where each triplet consists of a reference image, a modification text, and a target image, and constructing such datasets takes significant time and effort. To address this issue, we propose a method that introduces text refinement into a counterfactual image generation model to efficiently augment diverse triplet data. We conduct experiments on two types of datasets: real-world scene images and fashion item images. The results show that the augmented dataset generated by the proposed method is of sufficient quality to enhance the performance of CIR models.
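
As a rough illustration of the triplet structure described in the abstract, the following Python sketch represents a CIR training triplet and a generic augmentation loop. The abstract does not specify how text refinement or counterfactual image generation is implemented, so `Triplet`, `augment_triplets`, `refine_text`, and `generate_counterfactual` are hypothetical names, and the order of the steps is an assumption rather than the authors' pipeline.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class Triplet:
    """One CIR training example (field names are illustrative)."""
    reference_image: str    # path or ID of the reference image
    modification_text: str  # text describing the desired change
    target_image: str       # path or ID of the target image


def augment_triplets(
    pairs: Iterable[Tuple[str, str]],
    refine_text: Callable[[str], str],
    generate_counterfactual: Callable[[str, str], str],
) -> List[Triplet]:
    """Build triplets from (reference image, raw modification text) pairs.

    Both callables are placeholders supplied by the caller:
    refine_text stands in for a text refinement step, and
    generate_counterfactual stands in for a counterfactual image
    generation model that returns the generated target image.
    """
    triplets = []
    for reference_image, raw_text in pairs:
        text = refine_text(raw_text)  # assumed: refine before generation
        target = generate_counterfactual(reference_image, text)
        triplets.append(Triplet(reference_image, text, target))
    return triplets
```

With stub callables, `augment_triplets([("img1.jpg", " make it red ")], str.strip, lambda img, t: img + "|" + t)` yields one `Triplet` whose target is whatever the generator stub returns. In practice the two callables would wrap real models.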
