12:20 PM - 12:40 PM
[4A1-02] The Effectiveness of Joint Representation and the Extension to Unimodal Input on Semi-Supervised Multimodal Deep Generative Models
Keywords: deep generative model, multimodal learning, semi-supervised learning
In recent multimodal learning, deep neural networks are increasingly used as classifiers. Training them generally requires a large amount of labeled data, but labeling multimodal inputs is costly in human effort. Semi-supervised learning on multimodal data is therefore becoming important, and semi-supervised multimodal learning methods based on deep generative models have recently been proposed. In this study, we first compare these methods and show that SS-HMVAE, a method whose latent variables correspond to a joint representation, performs well, particularly when the modalities have no deterministic relation to one another. Next, to predict labels from unimodal data, we propose SS-HMVAE-kl, an extension of SS-HMVAE. We confirm that this method greatly improves performance on single-modality input compared with conventional models.