JSAI2019

Presentation information

General Session

General Session » [GS] J-2 Machine learning

[2H3-J-2] Machine learning: selective preprocess

Wed. Jun 5, 2019 1:20 PM - 3:00 PM Room H (303+304 Small meeting rooms)

Chair: Yoji Kiyota / Reviewer: Satoshi Oyama

1:40 PM - 2:00 PM

[2H3-J-2-02] Adaptive Learning Rate Adjustment with Short-Term Pre-Training in Data-Parallel Deep Learning

〇Kazuki Yamada1, Haruki Mori1, Tetsuya Youkawa1, Yuki Miyauchi1, Shintaro Izumi1, Masahiko Yoshimoto1, Hiroshi Kawaguchi1 (1. Kobe University)

Keywords:deep learning, learning rate, data-parallel, hyperparameter

This paper describes a short-term pre-training (STPT) algorithm that adaptively selects an optimum learning rate (LR). The proposed STPT algorithm is beneficial for quick model prototyping in data-parallel deep learning. It adaptively finds an appropriate LR from multiple LR candidates by STPT, meaning that the multiple LRs are evaluated within the first few iterations of an epoch. STPT bypasses the LR tuning process that conventional training procedures require as hyperparameter tuning, even when unknown models are considered. Therefore, the proposed STPT reduces computational time and increases throughput in finding the best LR for network training. The algorithm reduces computational time by 87.5% compared with the conventional method when eight LR candidates are evaluated using eight parallel workers. We verified an accuracy improvement of 4.8% compared with the conventional method using a reference LR of 0.1; no accuracy deterioration was observed. The algorithm also shows better training convergence and an advantage in training time over alternatives such as a fixed LR, especially for unknown models.
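The core idea, evaluating several candidate LRs for only a few iterations from the same starting weights and keeping the one that makes the fastest progress, can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: it simulates the parallel workers sequentially on a toy least-squares problem, and all function names (`sgd_steps`, `short_term_pretrain`) and hyperparameter values are assumptions.

```python
import numpy as np

def sgd_steps(w, X, y, lr, n_steps, batch=16, seed=42):
    """Run a few SGD steps on a mean-squared-error objective (toy stand-in
    for the first iterations of an epoch); return the updated weights."""
    rng = np.random.default_rng(seed)  # same minibatch sequence per candidate
    for _ in range(n_steps):
        idx = rng.choice(len(X), size=batch, replace=False)
        grad = 2.0 * X[idx].T @ (X[idx] @ w - y[idx]) / batch
        w = w - lr * grad
    return w

def loss(w, X, y):
    return float(np.mean((X @ w - y) ** 2))

def short_term_pretrain(w0, X, y, candidate_lrs, n_steps=20):
    """Short-term pre-training sketch: evaluate each candidate LR for a few
    iterations from identical initial weights (one candidate per worker in
    the data-parallel setting) and return the LR with the lowest loss."""
    scores = {lr: loss(sgd_steps(w0.copy(), X, y, lr, n_steps), X, y)
              for lr in candidate_lrs}
    return min(scores, key=scores.get), scores

# Toy regression data: y = X @ w_true + noise (hypothetical example data).
rng = np.random.default_rng(0)
X = rng.normal(size=(256, 4))
w_true = np.array([1.0, -2.0, 0.5, 3.0])
y = X @ w_true + 0.01 * rng.normal(size=256)

w0 = np.zeros(4)
best_lr, scores = short_term_pretrain(w0, X, y, [1e-3, 1e-2, 1e-1])
print(best_lr, scores)
```

After the short pre-training phase selects `best_lr`, full training would proceed with that LR only, which is what saves the repeated full training runs of conventional grid-style LR tuning.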