3:10 PM - 3:30 PM
[2D4-OS-18a-05] Information-Identifiability Perspective on Posterior Collapse and Conditional Mutual Information Maximization for Remedy
Keywords: deep generative models, variational autoencoder, posterior collapse
Variational autoencoder (VAE) training suffers from posterior collapse, in which the decoder of the VAE ignores the latent variables.
In this paper, we argue that an I-unidentifiable data-generating process, which several existing VAEs assume, induces posterior collapse.
This is because, in such a process, the information that a particular latent variable is designed to capture can easily be captured by other latent variables without sacrificing log-likelihood.
We show that this perspective gives a unified explanation of posterior collapse, using a VAE with an autoregressive decoder and the disentangled sequential autoencoder as examples.
In addition, we propose maximizing conditional mutual information with adversarial training to alleviate the unidentifiability issue; this approach does not require specific constraints on model architectures or latent variable structures.
Empirically, our method mitigated posterior collapse in both models and improved the rate-distortion curve.
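For readers unfamiliar with the quantity being maximized, the standard textbook definition of conditional mutual information is sketched below. The symbols are assumptions made for illustration and do not come from the abstract: $x$ stands for the observation, $z$ for the latent variable of interest, and $c$ for whatever the other latent variables (or the autoregressive context) already provide. The abstract does not state the exact objective or how the adversarial training estimates it, so this is only the general definition, not the authors' loss.

```latex
% Standard definition of conditional mutual information.
% x, z, c are hypothetical symbols chosen for this sketch.
\begin{align*}
I(x; z \mid c)
  &= \mathbb{E}_{p(x, z, c)}\!\left[
       \log \frac{p(x, z \mid c)}{p(x \mid c)\, p(z \mid c)}
     \right] \\
  &= H(z \mid c) - H(z \mid x, c).
\end{align*}
```

Under these assumed symbols, $I(x; z \mid c) = 0$ means that $z$ carries no information about $x$ beyond what $c$ already provides, which matches the intuition of a latent variable being ignored.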