JSAI2019

Presentation information

Interactive Session

[3Rin2] Interactive Session 1

Thu. Jun 6, 2019 10:30 AM - 12:10 PM Room R (Center area of 1F Exhibition hall)


[3Rin2-02] Bi-directional multimodal generation via estimating conditional distribution of latent variables obtained from pre-trained generative models

〇Shigeaki Imakiire¹, Masanao Ochi¹, Junichiro Mori¹, Ichiro Sakata¹ (1. University of Tokyo, School of Engineering, Department of Technology Management for Innovation)

Keywords: Deep Learning, Multimodal, Bi-directional generation

In recent years, multimodal generation, which converts between different kinds of data such as images and sentences, has attracted attention for its applicability to real services such as automatic image annotation and audio subtitling.

Meanwhile, in machine learning research, reusable models pre-trained on large-scale datasets are being released to the public, and their number is expected to increase.

Therefore, in this research, we aim to realize multimodal generation from a small amount of data by utilizing such pre-trained models.
In this paper, we propose a multimodal generation method that uses pre-trained generative models, from which the latent variables of each individual modality can be inferred, together with a small dataset. We realized multimodal generation by estimating the conditional distribution of the latent variables obtained from the pre-trained models using a small number of training examples.
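
The abstract does not state how the conditional distribution of the latent variables is estimated. Purely as an illustration, the sketch below assumes the paired latent codes of two modalities are jointly Gaussian, so that both p(z_y | z_x) and p(z_x | z_y) follow in closed form from one fitted joint distribution. The function names, the Gaussian assumption, and the synthetic latents (which stand in for the outputs of pre-trained encoders) are assumptions made for this example, not the authors' method.

```python
import numpy as np


def fit_joint_gaussian(zx, zy, ridge=1e-6):
    """Fit a joint Gaussian to paired latent codes (one pair per row).

    `ridge` is a small diagonal term (an assumption of this sketch) that
    keeps the covariance invertible when only a few pairs are available.
    """
    z = np.hstack([zx, zy])
    mean = z.mean(axis=0)
    cov = np.cov(z, rowvar=False) + ridge * np.eye(z.shape[1])
    return mean, cov


def conditional_gaussian(mean, cov, observed, obs_idx, tgt_idx):
    """Return mean and covariance of p(z_target | z_observed)."""
    mu_o, mu_t = mean[obs_idx], mean[tgt_idx]
    s_oo = cov[np.ix_(obs_idx, obs_idx)]
    s_to = cov[np.ix_(tgt_idx, obs_idx)]
    s_tt = cov[np.ix_(tgt_idx, tgt_idx)]
    # gain = S_to S_oo^{-1}; S_oo is symmetric, so solve and transpose.
    gain = np.linalg.solve(s_oo, s_to.T).T
    cond_mean = mu_t + gain @ (observed - mu_o)
    cond_cov = s_tt - gain @ s_to.T
    return cond_mean, cond_cov


# Synthetic stand-ins for latents from two pre-trained models (hypothetical).
rng = np.random.default_rng(0)
dim_x, dim_y, n_pairs = 4, 3, 200
mixing = rng.normal(size=(dim_x, dim_y))
zx = rng.normal(size=(n_pairs, dim_x))
zy = zx @ mixing + 0.01 * rng.normal(size=(n_pairs, dim_y))

mean, cov = fit_joint_gaussian(zx, zy)
x_idx = np.arange(dim_x)
y_idx = np.arange(dim_x, dim_x + dim_y)

# Forward direction: infer the latent of modality y from modality x.
zx_new = rng.normal(size=dim_x)
zy_mean, zy_cov = conditional_gaussian(mean, cov, zx_new, x_idx, y_idx)

# Reverse direction: the same joint fit, with the index sets swapped.
zx_mean, zx_cov = conditional_gaussian(mean, cov, zy_mean, y_idx, x_idx)
```

Under these assumptions, passing `zy_mean` (or a sample drawn from the conditional Gaussian) to the pre-trained decoder of the target modality would produce the generated output. Swapping the index sets gives the reverse direction without refitting, which is what makes the sketch bi-directional.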