JSAI2020

Presentation information

General Session

[1Q3-GS-11] Robot and real worlds: Multimodal information

Tue. Jun 9, 2020 1:20 PM - 3:00 PM Room Q (jsai2020online-17)

Chair: 青島武伸 (Panasonic Corporation)

2:00 PM - 2:20 PM

[1Q3-GS-11-03] Multimodal Learning by Interaction between Probabilistic and Deep Generative Models

〇Ryo Kuniyasu1, Tomoaki Nakamura1, Takayuki Nagai2, Tadahiro Taniguchi3 (1. The University of Electro-Communications, 2. Osaka University, 3. Ritsumeikan University)

Keywords: unsupervised learning, multimodal, probabilistic generative model, deep generative model

To realize human-like intelligence artificially, robots need large-scale models that let them understand their environment from the multimodal information obtained by their various sensors. To this end, we have proposed models that enable robots to acquire language and concepts by classifying multimodal information. These models use multimodal latent Dirichlet allocation (MLDA) to learn, in an unsupervised manner, the relationships among the features extracted from each modality. However, this approach is not completely unsupervised, because the feature extraction itself involves supervised learning. Moreover, the observations themselves cannot be generated, because the feature extraction is irreversible. Therefore, in this study, we propose the multinomial variational autoencoder (MNVAE) and construct a model that integrates the MNVAE and MLDA using the Symbol Emergence in Robotics tool KIT (SERKET). We classify multimodal information consisting of images and words obtained from a robot using the integrated model, and demonstrate that a latent space suitable for classification can be learned and that images can be generated from words.
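As a rough illustration of how MLDA ties modalities together, the sketch below samples one observation from the standard MLDA generative process: a single category mixture is drawn per observation, and every modality draws its features through that shared mixture. This is not the authors' implementation and does not include the MNVAE. The function names, the two-category setup, and all parameter values are hypothetical and chosen only for the example.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one sample from Dirichlet(alpha) via normalized Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def sample_categorical(probs, rng):
    """Draw an index with probability proportional to probs."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def generate_mlda_observation(alpha, phi, n_tokens, rng):
    """Generate one multimodal bag-of-features observation.

    alpha    : Dirichlet prior over K categories (list of K floats)
    phi      : dict modality -> K rows, each a distribution over that
               modality's feature vocabulary
    n_tokens : dict modality -> number of features to draw
    Returns dict modality -> list of per-feature counts.
    """
    # One category mixture shared by all modalities of this observation.
    theta = sample_dirichlet(alpha, rng)
    counts = {}
    for modality, category_rows in phi.items():
        hist = [0] * len(category_rows[0])
        for _ in range(n_tokens[modality]):
            z = sample_categorical(theta, rng)             # latent category
            w = sample_categorical(category_rows[z], rng)  # observed feature
            hist[w] += 1
        counts[modality] = hist
    return counts

# Toy parameters (assumed values, not taken from the paper).
rng = random.Random(0)
alpha = [0.5, 0.5]
phi = {
    "image": [[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]],
    "word": [[0.9, 0.1], [0.1, 0.9]],
}
n_tokens = {"image": 20, "word": 5}
obs = generate_mlda_observation(alpha, phi, n_tokens, rng)
```

Because both modalities share the same category mixture, inferring that mixture from word counts alone constrains the image-side features, which is the mechanism that makes word-to-image prediction possible in this family of models.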
