Presentation information

General Session


[4B3-GS-1] Fundamental AI, theory (2)

Fri. Jun 12, 2020 2:00 PM - 3:40 PM Room B (jsai2020online-2)


3:20 PM - 3:40 PM

[4B3-GS-1-05] Hessian spectral analysis for adaptive optimizers of neural networks

〇Tetsuya Motokawa1, Taro Tezuka1 (1. University of Tsukuba)

Keywords: deep learning, optimization, Hessian matrix, loss surface analysis

When training neural networks, adaptive optimization methods such as Adam are widely used because of their fast convergence.
On the other hand, it has been pointed out that the parameters obtained by these adaptive methods do not generalize as well as those obtained by SGD.
The mechanism behind this difference is still not fully understood.
We analyzed convergence points reached by adaptive and non-adaptive methods using the Hessian spectrum of the loss function with respect to parameters.
Experiments showed that SGD tends to converge to flatter locations than adaptive optimizers do.
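Hessian spectral analysis of this kind typically estimates the leading eigenvalue of the loss Hessian (a common sharpness measure) without forming the full matrix, using Hessian-vector products inside power iteration. The abstract does not give implementation details, so the sketch below is only illustrative: it uses finite-difference Hessian-vector products on a toy quadratic loss with a known spectrum, and all function names are hypothetical.

```python
import numpy as np

def hvp(loss_grad, w, v, eps=1e-5):
    # Finite-difference Hessian-vector product:
    #   H v ≈ (∇L(w + eps·v) − ∇L(w − eps·v)) / (2·eps)
    return (loss_grad(w + eps * v) - loss_grad(w - eps * v)) / (2 * eps)

def top_hessian_eigenvalue(loss_grad, w, n_iter=100, seed=0):
    # Power iteration on the Hessian at w: repeatedly apply H to a
    # random unit vector; the Rayleigh quotient v·Hv converges to
    # the largest-magnitude eigenvalue (the sharpness measure).
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(w.shape)
    v /= np.linalg.norm(v)
    lam = 0.0
    for _ in range(n_iter):
        hv = hvp(loss_grad, w, v)
        lam = float(v @ hv)
        v = hv / np.linalg.norm(hv)
    return lam

# Toy quadratic loss L(w) = 0.5 wᵀAw with gradient Aw; the Hessian
# is A itself, so the true top eigenvalue is 4.0.
A = np.diag([4.0, 1.0, 0.5])
grad = lambda w: A @ w
w = np.array([1.0, 1.0, 1.0])
print(top_hessian_eigenvalue(grad, w))  # ≈ 4.0
```

For real networks, a flatter minimum corresponds to a smaller leading eigenvalue; comparing this quantity at the convergence points of SGD and Adam is one standard way to quantify the difference in flatness described above.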
