A Study on Behavior of Deep Neural Text-to-Image Generative Model

Chihiro Fujiyama

9:20 AM - 9:40 AM

[2L1-J-9-02] A Study on Behavior of Deep Neural Text-to-Image Generative Model

〇Chihiro Fujiyama¹, Ichiro Kobayashi¹ (1. Ochanomizu University)

Keywords: Grounding between language and images, Deep learning, Image generation

In this study, we analyze the computational mechanism and the structure of the feature representation space of a deep neural text-to-image generative model. This is a fundamental step toward the goal of constructing artificial general intelligence that reflects the mechanisms of human intelligence. First, we examine whether the model can encode captions and generate valid images when the input captions are given without word boundaries. Qualitative and quantitative evaluations show that it can generate compelling images, but that its computational mechanism does not acquire units of meaning. Second, we analyze semantic compositionality in the embedding space. Our experimental results suggest that semantic compositionality appears among words that indicate positions.
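
As a rough illustration of the two analyses described in the abstract, the following Python sketch shows one way to remove word boundaries from a caption and one way to probe an embedding space for a shared offset between position expressions. It is not the authors' method: the function names, the choice of cosine similarity over difference vectors, and the hand-made toy vectors are all assumptions made for this example, and none of the values come from the model in the study.

```python
import math


def strip_word_boundaries(caption: str) -> str:
    """Remove all whitespace so the caption carries no explicit word boundaries."""
    return "".join(caption.split())


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def offset_similarity(emb, a, b, c, d):
    """Cosine between (emb[a] - emb[b]) and (emb[c] - emb[d]).

    A value near 1 means both pairs differ along a similar direction, which is
    one simple (assumed) indicator of compositional structure in the space.
    """
    diff_ab = [x - y for x, y in zip(emb[a], emb[b])]
    diff_cd = [x - y for x, y in zip(emb[c], emb[d])]
    return cosine(diff_ab, diff_cd)


# Hypothetical toy embeddings, written by hand for illustration only.
toy_embeddings = {
    "bird on the left": [1.0, 0.0, -1.0],
    "bird on the right": [1.0, 0.0, 1.0],
    "dog on the left": [0.0, 1.0, -1.0],
    "dog on the right": [0.0, 1.0, 1.0],
}

if __name__ == "__main__":
    print(strip_word_boundaries("a bird on the left"))
    print(offset_similarity(
        toy_embeddings,
        "bird on the left", "bird on the right",
        "dog on the left", "dog on the right",
    ))
```

In practice the toy dictionary would be replaced by embeddings taken from the text encoder under study; the offset comparison itself is only one of several possible probes.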

Presentation information

[2L1-J-9] Natural language processing, information retrieval: fusion with image
