JSAI2023

Presentation information

General Session

[1B5-GS-2] Machine learning

Tue. Jun 6, 2023 5:00 PM - 6:40 PM Room B (Civic hall B)

Chair: 松野 竜太 (NEC) [On-site]

5:20 PM - 5:40 PM

[1B5-GS-2-02] Explicit modeling of the implicit regularization effect of SGD

〇Shota Nakamura1, Rio Yokota1 (1. Tokyo Institute of Technology)

Keywords: Deep learning, Optimization, Distributed Parallel Learning

Distributed parallel training has become necessary as deep learning models and datasets grow. Data parallelism is the easiest distributed training method to implement: each GPU holds a replica of the model, and each batch is split across the GPUs. However, as the number of GPUs increases, the batch size increases proportionally, and generalization performance deteriorates because the implicit regularization effect of SGD is lost. In this study, we aim to alleviate this large-batch problem by explicitly regularizing with the gradient norm.
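
As an illustration of what regularizing by the gradient norm can look like, the following minimal sketch adds a penalty lam * ||grad L(w)||^2 to a least-squares loss and runs full-batch gradient descent on the penalized objective. The penalty form, the coefficient `lam`, the finite-difference Hessian-vector product, and all function names are assumptions made for illustration; the abstract does not specify the formulation used in the paper.

```python
import numpy as np


def loss(w, X, y):
    """Mean squared error (with a 1/2 factor) of a linear model."""
    residual = X @ w - y
    return 0.5 * np.mean(residual ** 2)


def grad(w, X, y):
    """Gradient of `loss` with respect to w."""
    return X.T @ (X @ w - y) / len(y)


def penalized_loss(w, X, y, lam):
    """Assumed explicit regularizer: loss + lam * squared gradient norm."""
    g = grad(w, X, y)
    return loss(w, X, y) + lam * (g @ g)


def regularized_grad(w, X, y, lam, eps=1e-4):
    """Gradient of `penalized_loss`: g + 2 * lam * H g.

    The Hessian-vector product H g is approximated by a forward finite
    difference of the gradient, so no explicit Hessian is formed.
    """
    g = grad(w, X, y)
    hessian_vec = (grad(w + eps * g, X, y) - g) / eps
    return g + 2.0 * lam * hessian_vec


def train(X, y, lam, lr=0.1, steps=200):
    """Full-batch gradient descent on the penalized objective."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w = w - lr * regularized_grad(w, X, y, lam)
    return w
```

On this quadratic toy problem the penalty vanishes at the minimizer, so it changes only the optimization path; the example is meant to show the mechanics, not the generalization effect discussed in the abstract.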
