JSAI2023

Presentation information

General Session

[3E5-GS-2] Machine learning

Thu. Jun 8, 2023 3:30 PM - 4:50 PM Room E (A2)

Chair: Kentaro Kanamori (Fujitsu) [On-site]

3:30 PM - 3:50 PM

[3E5-GS-2-01] End-to-end Training of Deep Boltzmann Machines Using Unbiased MCMC

〇Shohei Taniguchi¹, Masahiro Suzuki¹, Yusuke Iwasawa¹, Yutaka Matsuo¹ (1. The University of Tokyo)

Keywords: Boltzmann machine, unbiased MCMC

We address the problem of biased gradient estimation in deep Boltzmann machines (DBMs). The existing
method for obtaining an unbiased estimator uses a maximal coupling based on a Gibbs sampler, but it takes
a long time to converge when the state is high-dimensional. In this study, we propose using a coupling based on the
Metropolis-Hastings (MH) algorithm and initializing the state around a local mode of the target distribution. Because
MH tends to reject proposals, the coupling converges in only one step with high probability,
leading to high efficiency. We find that our method allows DBMs to be trained end-to-end without
greedy pretraining. We also propose some practical techniques to further improve the performance of DBMs. We
empirically demonstrate that our training algorithm enables DBMs to achieve generative performance comparable to
that of other deep generative models, achieving an FID score of 10.33 on MNIST.
