10:30 AM - 12:10 PM
[3Rin2-02] Bi-directional multimodal generation via estimating conditional distribution of latent variables obtained from pre-trained generative models
Keywords: Deep Learning, Multimodal, Bi-directional generation
In recent years, multimodal generation, which converts between different kinds of data such as images and sentences, has attracted attention for its applicability to real-world services such as automatic image annotation and audio subtitling.
Meanwhile, in machine learning research, reusable models pre-trained on large-scale datasets are being released to the public, and their number is expected to increase.
Therefore, in this research, we aim to realize multimodal generation from a small amount of data by utilizing such pre-trained models.
In this paper, we propose a multimodal generation method that combines a small dataset with pre-trained generative models in which the latent variables of each individual modality can be inferred. We realize multimodal generation by estimating the conditional distribution of the latent variables obtained from the pre-trained models, using only a small number of training examples.
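As a rough illustration of the idea, the following minimal sketch estimates a conditional distribution between two sets of latent variables from a small number of paired examples. The abstract does not state which estimator is used, so the linear-Gaussian model here is an assumption chosen for simplicity, and the function names `fit_conditional_gaussian` and `conditional_mean`, as well as the `ridge` parameter, are hypothetical. The latent codes are assumed to have already been inferred by the pre-trained model of each modality.

```python
import numpy as np


def fit_conditional_gaussian(z_x, z_y, ridge=1e-3):
    """Fit p(z_y | z_x) as a linear-Gaussian model from paired latent codes.

    Assumption: the paired latents are modeled as jointly Gaussian; the
    abstract does not specify the actual estimator.

    z_x: (n, d_x) latents inferred by the pre-trained model of modality X
    z_y: (n, d_y) latents inferred by the pre-trained model of modality Y
    ridge: small regularizer on the covariance of z_x, helpful when n is small
    """
    n = z_x.shape[0]
    mu_x, mu_y = z_x.mean(axis=0), z_y.mean(axis=0)
    xc, yc = z_x - mu_x, z_y - mu_y

    # Empirical (regularized) covariance blocks of the joint distribution.
    s_xx = xc.T @ xc / n + ridge * np.eye(z_x.shape[1])
    s_xy = xc.T @ yc / n
    s_yy = yc.T @ yc / n

    # A = S_xx^{-1} S_xy maps a centered z_x to the conditional mean shift.
    a = np.linalg.solve(s_xx, s_xy)
    # Conditional covariance: S_yy - S_yx S_xx^{-1} S_xy.
    cov = s_yy - s_xy.T @ a
    return mu_x, mu_y, a, cov


def conditional_mean(z_x_new, params):
    """Return E[z_y | z_x] for each row of z_x_new."""
    mu_x, mu_y, a, _ = params
    return mu_y + (z_x_new - mu_x) @ a
```

Under these assumptions, generation from modality X to modality Y would encode the input with the pre-trained model of X, map the latent with `conditional_mean` (or sample using the returned covariance), and decode with the pre-trained model of Y. Swapping the two arguments of `fit_conditional_gaussian` gives the opposite direction, which is one way to obtain bi-directional generation from the same paired data.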