JSAI2022

Presentation information

General Session


[1O5-GS-7] Vision, speech media processing: clustering / generation

Tue. Jun 14, 2022 4:20 PM - 6:00 PM Room O (Room 510)

Chair: Shuhei Yoshida (吉田 周平, NEC) [remote]

4:40 PM - 5:00 PM

[1O5-GS-7-02] A Zero-Shot Instance Segmentation Method Using a Domain-Independent Image Classification Model

〇Masaya Oikawa¹, Takuto Yamauchi¹, Kenji Tei¹ (1. Waseda University)

Keywords: Zero-shot Learning, Segmentation, Image Classification

Instance segmentation is a technique that segments individual object instances at the pixel level. To address the practical problem of insufficient training data, many studies have explored building detection models from small datasets. One such approach is zero-shot learning, which aims to correctly detect even unseen objects by transferring visual knowledge obtained from seen objects through intermediate representations such as word vectors. In this paper, we propose ZSIwithCLIP, which introduces a domain-independent image classification model (CLIP) to improve the class recognition accuracy of a zero-shot instance segmentation model (ZSI). We conducted training and inference experiments to examine the effectiveness of ZSIwithCLIP. Quantitatively, the overall accuracy of instance segmentation decreased slightly; qualitatively, the classification of semantically similar classes improved for simple images containing few objects with little overlap between them.
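
The zero-shot idea described in the abstract, assigning a class to a detected region by comparing its visual embedding with per-class embeddings, can be illustrated with a minimal sketch. This is not the authors' ZSIwithCLIP implementation: the function name `zero_shot_classify`, the `temperature` parameter, and the toy 3-dimensional vectors are all hypothetical, and the vectors merely stand in for the image and text features that a model such as CLIP would produce.

```python
import numpy as np

def zero_shot_classify(region_emb, class_emb, class_names, temperature=0.01):
    """Label each region with the class whose embedding is most similar.

    Hypothetical helper: region_emb is (num_regions, dim) and
    class_emb is (num_classes, dim); both are assumed to live in
    the same embedding space.
    """
    # L2-normalize so the dot product equals cosine similarity.
    region = region_emb / np.linalg.norm(region_emb, axis=1, keepdims=True)
    classes = class_emb / np.linalg.norm(class_emb, axis=1, keepdims=True)

    # Scaled cosine similarities, followed by a numerically stable softmax.
    logits = region @ classes.T / temperature
    logits -= logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)

    best = probs.argmax(axis=1)
    return [class_names[i] for i in best], probs

# Toy embeddings (assumed values, not real model outputs).
# "cat" and "dog" are deliberately close, mimicking semantically similar classes.
class_names = ["cat", "dog", "zebra"]
class_emb = np.array([
    [1.0, 0.1, 0.0],
    [0.9, 0.3, 0.0],
    [0.0, 0.2, 1.0],
])
region_emb = np.array([
    [0.1, 0.1, 0.9],   # a region that resembles the "zebra" direction
    [1.0, 0.05, 0.0],  # a region that resembles the "cat" direction
])

labels, probs = zero_shot_classify(region_emb, class_emb, class_names)
print(labels)
```

Because no class needs training examples of its own, only an embedding, an unseen class can be added by appending one more row to `class_emb`. How well semantically similar classes are separated then depends entirely on the quality of the embedding space.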
