Bi-directional multimodal generation via estimating conditional distribution of latent variables obtained from pre-trained generative models

Shigeaki Imakiire

10:30 AM - 12:10 PM

[3Rin2-02] Bi-directional multimodal generation via estimating conditional distribution of latent variables obtained from pre-trained generative models

〇Shigeaki Imakiire¹, Masanao Ochi¹, Junichiro Mori¹, Ichiro Sakata¹ (1. University of Tokyo School of Engineering Department of Technology Management for Innovation)

Keywords: Deep Learning, Multimodal, Bi-directional generation

In recent years, multimodal generation, which converts between different kinds of data such as images and sentences, has attracted attention for its applicability to real-world services such as automatic image annotation and audio subtitling.

Meanwhile, in machine learning research, reusable models pre-trained on large-scale datasets are being released to the public, and their number is expected to grow.

Therefore, in this research, we aim to realize multimodal generation from a small amount of data by utilizing such pre-trained models.
In this paper, we propose a multimodal generation method that combines pre-trained generative models, each of which allows the latent variables of its own modality to be inferred, with a small paired dataset. We realize multimodal generation by estimating the conditional distribution of the latent variables obtained from the pre-trained models using a small number of training examples.
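
As a rough illustration of the idea, the following minimal sketch estimates a conditional distribution between the latent variables of two modalities. The abstract does not state which estimator is used, so the joint Gaussian model here is an assumption made purely for illustration, and the names `fit_joint_gaussian` and `conditional` are hypothetical. The latents are synthetic stand-ins; in practice they would be inferred by the encoders of two pre-trained generative models, and the predicted latent would be passed to the other model's decoder.

```python
import numpy as np


def fit_joint_gaussian(z_a, z_b, ridge=1e-6):
    """Fit one Gaussian to paired latents z_a (n, d_a) and z_b (n, d_b).

    Assumption: the paired latents are modelled as jointly Gaussian.
    The small ridge term keeps the covariance invertible when n is small.
    """
    z = np.hstack([z_a, z_b])
    mu = z.mean(axis=0)
    cov = np.cov(z, rowvar=False) + ridge * np.eye(z.shape[1])
    return mu, cov


def conditional(mu, cov, given_idx, target_idx, z_given):
    """Mean and covariance of p(z_target | z_given) under the joint Gaussian."""
    s_gg = cov[np.ix_(given_idx, given_idx)]
    s_gt = cov[np.ix_(given_idx, target_idx)]
    s_tt = cov[np.ix_(target_idx, target_idx)]
    # gain = S_tg S_gg^{-1}, computed with a linear solve instead of an inverse
    gain = np.linalg.solve(s_gg, s_gt).T
    cond_mean = mu[target_idx] + gain @ (z_given - mu[given_idx])
    cond_cov = s_tt - gain @ s_gt
    return cond_mean, cond_cov


# Synthetic stand-in for a small paired dataset of latents (hypothetical data).
rng = np.random.default_rng(0)
true_map = np.array([[1.0, 0.5], [-0.5, 1.0]])
latents_a = rng.normal(size=(200, 2))                      # e.g. image latents
latents_b = latents_a @ true_map.T + 0.01 * rng.normal(size=(200, 2))  # e.g. text latents

mu, cov = fit_joint_gaussian(latents_a, latents_b)
a_idx, b_idx = np.arange(2), np.arange(2, 4)

# Direction A -> B: predict the latent of modality B from a latent of modality A.
mean_b, cov_b = conditional(mu, cov, a_idx, b_idx, np.array([0.3, -0.7]))
# Direction B -> A: the same fitted model serves the reverse direction.
mean_a, cov_a = conditional(mu, cov, b_idx, a_idx, mean_b)
```

Because a single joint model is fitted, both conversion directions come from the same estimate, which is one simple way to obtain bi-directional generation from few paired examples while the pre-trained models themselves stay fixed.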

Presentation information

[3Rin2] Interactive Session 1
