JSAI2022

Presentation information

Organized Session » OS-19

[2M1-OS-19a] World Models and Intelligence (1/4)

Wed. Jun 15, 2022 9:00 AM - 10:40 AM Room M (Room B-2)

Organizers: Masahiro Suzuki (The University of Tokyo), Yusuke Iwasawa (The University of Tokyo) [on-site], Makoto Kawano (The University of Tokyo), Wataru Kumagai (The University of Tokyo), Yusuke Mori (SQUARE ENIX), Yutaka Matsuo (The University of Tokyo)

9:40 AM - 10:00 AM

[2M1-OS-19a-03] A Deep Generative Model for Extracting Shared and Private Latent Representations from Multimodal Data

〇Kaito Kusumoto¹, Shingo Murata¹ (1. Keio University)

Keywords: Deep Generative Model, Multimodal, Variational Autoencoder, Shared-Private Representation

Representation learning of multimodal data has the potential to uncover structure that is shared across modalities. The objective of this study is to develop a computational framework that learns to extract latent representations from multimodal data using a deep generative model. A particular modality is considered to hold low-dimensional latent representations; however, these representations are not always fully shared with another modality. We therefore assume that each modality holds both shared and private latent representations. Under this assumption, we propose a deep generative model that learns to extract these different latent representations from both non-time-series and time-series data in an end-to-end manner. To evaluate the framework, we conducted a simulation experiment using an artificial multimodal dataset consisting of images and strokes with shared and private information. Experimental results demonstrate that the proposed framework successfully learned to extract both the shared and private latent representations.
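
To make the assumed shared-private factorization concrete, below is a minimal sketch, in PyTorch, of a two-modality VAE in which each modality is generated from a shared latent together with its own private latent. It is not the authors' implementation: the layer sizes, the parameter-averaging fusion of the two shared posteriors, the Gaussian likelihood (MSE reconstruction), and the restriction to non-time-series inputs are all illustrative assumptions.

# Illustrative sketch (not the authors' code): a two-modality VAE where
# each modality x_m is generated from a shared latent z_sh and a private
# latent z_m. Layer sizes and the fusion rule are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

D_X1, D_X2 = 784, 64       # assumed input sizes (e.g., image, stroke features)
D_SH, D_PR, D_H = 8, 4, 256

class Encoder(nn.Module):
    """Maps one modality to Gaussian parameters of shared and private latents."""
    def __init__(self, d_in):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(d_in, D_H), nn.ReLU())
        self.sh = nn.Linear(D_H, 2 * D_SH)   # (mu, logvar) of shared latent
        self.pr = nn.Linear(D_H, 2 * D_PR)   # (mu, logvar) of private latent
    def forward(self, x):
        h = self.net(x)
        return self.sh(h).chunk(2, -1), self.pr(h).chunk(2, -1)

class Decoder(nn.Module):
    """Reconstructs one modality from its shared + private latents."""
    def __init__(self, d_out):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(D_SH + D_PR, D_H), nn.ReLU(), nn.Linear(D_H, d_out))
    def forward(self, z_sh, z_pr):
        return self.net(torch.cat([z_sh, z_pr], -1))

def reparam(mu, logvar):
    # Reparameterization trick: sample z = mu + sigma * eps
    return mu + torch.randn_like(mu) * (0.5 * logvar).exp()

def kl(mu, logvar):
    # KL divergence from N(mu, sigma^2) to the standard normal prior
    return -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()

class SharedPrivateVAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc1, self.enc2 = Encoder(D_X1), Encoder(D_X2)
        self.dec1, self.dec2 = Decoder(D_X1), Decoder(D_X2)
    def forward(self, x1, x2):
        (sh1_mu, sh1_lv), (pr1_mu, pr1_lv) = self.enc1(x1)
        (sh2_mu, sh2_lv), (pr2_mu, pr2_lv) = self.enc2(x2)
        # Fuse the two shared posteriors; simple parameter averaging here
        # (a product-of-experts fusion is another common choice).
        sh_mu, sh_lv = (sh1_mu + sh2_mu) / 2, (sh1_lv + sh2_lv) / 2
        z_sh = reparam(sh_mu, sh_lv)
        z_p1, z_p2 = reparam(pr1_mu, pr1_lv), reparam(pr2_mu, pr2_lv)
        rec = (F.mse_loss(self.dec1(z_sh, z_p1), x1)
               + F.mse_loss(self.dec2(z_sh, z_p2), x2))
        reg = kl(sh_mu, sh_lv) + kl(pr1_mu, pr1_lv) + kl(pr2_mu, pr2_lv)
        return rec + reg  # negative ELBO (up to constants)

model = SharedPrivateVAE()
loss = model(torch.randn(16, D_X1), torch.randn(16, D_X2))
loss.backward()

A time-series variant, as mentioned in the abstract, would presumably replace the feedforward encoders and decoders with recurrent ones while keeping the same shared-private factorization of the latent space.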
