Improving Object Coverage of Text-to-Image Generation by Object Matching

Shogo Ishii

10:20 AM - 10:40 AM

[2O1-GS-7-05] Improving Object Coverage of Text-to-Image Generation by Object Matching

〇Shogo Ishii¹, Tomoaki Yamazaki¹, Seiya Ito¹, Kouzou Ohara¹ (1. Aoyama Gakuin University)

Keywords: Text-to-Image, GANs, Object Detection

Text-to-image generation aims to generate an image from a given text that describes scene information such as objects and scenery. Existing methods use an attention mechanism to implicitly learn, from text-image pairs, the correspondence between words in the text and regions in the image. However, objects specified in the text often fail to appear in the generated image. In this paper, we propose a text-to-image generation model that explicitly learns the correspondence between objects in the text and objects in the generated image in order to improve object coverage. The proposed method applies object detection to the generated image and encourages missing objects to appear by introducing a loss function that accounts for how completely the objects in the text are matched by objects in the image. We demonstrate that our model outperforms existing methods in object coverage.
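The abstract does not give the form of the loss, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes that the objects named in the text are available as a list of class names and that an object detector run on the generated image yields a best confidence score per class. The function names `object_coverage` and `missing_object_loss`, and the negative-log penalty, are hypothetical choices made for illustration.

```python
import math


def object_coverage(text_objects, detected_objects):
    """Fraction of distinct objects named in the text that were detected in the image."""
    wanted = set(text_objects)
    if not wanted:
        return 1.0  # nothing was requested, so nothing is missing
    return len(wanted & set(detected_objects)) / len(wanted)


def missing_object_loss(text_objects, detection_scores, eps=1e-8):
    """Hypothetical penalty (an assumption, not the paper's loss).

    detection_scores maps a class name to the detector's best confidence
    for that class in the generated image. Classes the detector did not
    find are treated as having confidence 0, which gives a large penalty.
    """
    wanted = sorted(set(text_objects))
    if not wanted:
        return 0.0
    total = 0.0
    for name in wanted:
        score = detection_scores.get(name, 0.0)
        total += -math.log(max(score, eps))  # clamp to avoid log(0)
    return total / len(wanted)


if __name__ == "__main__":
    text_objects = ["dog", "frisbee"]
    scores = {"dog": 0.9, "person": 0.7}  # the frisbee was not detected
    print(object_coverage(text_objects, scores.keys()))
    print(missing_object_loss(text_objects, scores))
```

In this sketch the penalty grows as the detector's confidence for a text-specified object falls, so an image in which the frisbee is absent scores worse than one in which both objects are found. In actual training, such a term would have to be computed from differentiable detector outputs and combined with the usual GAN objectives.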


Presentation information

[2O1-GS-7] Vision, speech media processing: generation
