Image Generation reflecting the Meaning of Language that reveals Object's Attributes

Sayako Watanabe; Lis Kanashiro Pereira; Ichioro Kobayashi

[2Yin5-22] Image Generation reflecting the Meaning of Language that reveals Object's Attributes

〇Sayako Watanabe¹, Lis Kanashiro Pereira¹, Ichioro Kobayashi¹ (1.Ochanomizu University)

Keywords: Grounding of adjective meaning, Text-to-Image, VAE

Although recent text-to-image models have achieved great success in generating images from a description of an object, such as "a bird with brown and black striped wings and a yellow beak", these models may still struggle to generate images that reflect an understanding of the object's attributes. We propose a text-to-image model that better reflects the meaning of words that express an object's attributes (i.e., adjectives). More specifically, we consider the case where the vector representations of shoe images are changed according to four adjectives (sporty, comfortable, pointy, and open), and we generate images that better reflect the meaning of these adjectives.
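
The abstract does not say how the vector representation of a shoe image is changed by an adjective. As a rough illustration only, one common way to do this with a VAE-style latent space is to shift a latent vector along a per-adjective direction. The sketch below assumes that approach; the helper names `attribute_direction` and `shift_latent`, the difference-of-means direction, the 8-dimensional latent space, and the random toy data are all hypothetical and are not taken from the paper.

```python
import numpy as np


def attribute_direction(latents, labels):
    """Unit vector pointing from the mean latent of images without the
    attribute to the mean latent of images with it (assumed technique)."""
    latents = np.asarray(latents, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    direction = latents[labels].mean(axis=0) - latents[~labels].mean(axis=0)
    return direction / np.linalg.norm(direction)


def shift_latent(z, direction, strength=1.0):
    """Move a latent vector along an attribute direction by `strength`."""
    return np.asarray(z, dtype=float) + strength * direction


# Toy stand-in for encoder outputs: 200 latent vectors of dimension 8.
rng = np.random.default_rng(0)
latent_dim = 8
latents = rng.normal(size=(200, latent_dim))

# Hypothetical binary labels for the adjective "sporty"; for the toy data
# we pretend that dimension 0 of the latent space encodes the attribute.
labels = rng.random(200) < 0.5
latents[labels, 0] += 2.0

sporty_dir = attribute_direction(latents, labels)

# Shift one latent vector toward "sporty"; a decoder (not shown) would
# then turn z_sporty back into an image.
z = rng.normal(size=latent_dim)
z_sporty = shift_latent(z, sporty_dir, strength=1.5)
```

In a sketch like this, one direction would be estimated for each of the four adjectives, and `strength` would control how strongly the decoded image expresses the adjective.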

Presentation information

[2Yin5] Interactive 2
