JSAI2019

Presentation information

General Session

[2L1-J-9] Natural language processing, information retrieval: fusion with image

Wed. Jun 5, 2019 9:00 AM - 10:00 AM Room L (203+204 Small meeting rooms)

Chair: Chiaki Miyazaki Reviewer: Tomoya Yoshikawa

9:20 AM - 9:40 AM

[2L1-J-9-02] A Study on Behavior of Deep Neural Text-to-Image Generative Model

〇Chihiro Fujiyama1, Ichiro Kobayashi1 (1. Ochanomizu University)

Keywords: Grounding between language and images, Deep Learning, Image Generation

In this study, we analyze the computational mechanism and the structure of the feature representation space of a deep neural text-to-image generative model. This is a fundamental step toward constructing artificial general intelligence that reflects the mechanism of human intelligence. First, we examine whether the model can encode captions and generate valid images when the input captions are given without word boundaries. Qualitative and quantitative evaluations show that the model can generate compelling images, but that its computational mechanism does not acquire units of meaning. Second, we analyze semantic compositionality in the embedding space. Our experimental result suggests that semantic compositionality appears among words that indicate positions.
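
The two analyses described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not say how the captions were preprocessed or how compositionality was measured. The function names, the offset-based similarity measure, and the toy vectors below are all assumptions made for illustration only.

```python
import math


def strip_word_boundaries(caption: str) -> str:
    """Remove all whitespace so the caption becomes one unsegmented string."""
    return "".join(caption.split())


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def offset_similarity(emb, a, b, c, d):
    """Cosine between the offsets emb[a] - emb[b] and emb[c] - emb[d].

    A value near 1 means the two phrase pairs differ along the same
    direction, which is one common (assumed) proxy for compositionality.
    """
    diff_ab = [x - y for x, y in zip(emb[a], emb[b])]
    diff_cd = [x - y for x, y in zip(emb[c], emb[d])]
    return cosine(diff_ab, diff_cd)


# Hypothetical toy embeddings, made up for this sketch (not model outputs).
toy_emb = {
    "top left": [1.0, 1.0, 0.0],
    "top right": [1.0, -1.0, 0.0],
    "bottom left": [-1.0, 1.0, 0.0],
    "bottom right": [-1.0, -1.0, 0.0],
}

unsegmented = strip_word_boundaries("a bird on a branch")
score = offset_similarity(
    toy_emb, "top left", "top right", "bottom left", "bottom right"
)
```

In practice the vectors would be taken from the text encoder of the model under study, and the score would be compared across many phrase pairs rather than read from a single example.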