Extending deep learning design methodology based on scaling laws in signal propagation processes

Keiichi Tamai

2:20 PM - 2:40 PM

[4I3-GS-11-02] Extending deep learning design methodology based on scaling laws in signal propagation processes

〇Keiichi Tamai¹, Tsuyoshi Okubo¹, Truong Vinh Truong Duy², Synge Todo¹ (1. The University of Tokyo, 2. Aisin Corp.)

Keywords: deep learning, neural tangent kernel, scaling laws

Establishing a systematic design methodology for deep neural networks is an unavoidable challenge if deep learning technology is to be developed sustainably for human society. At the previous JSAI conference, we reported evidence that scaling laws observed in the signal propagation process of deep neural networks are useful for selecting initialization conditions, learning rates, and the number of hidden layers. However, that report dealt only with multilayer perceptrons and low-dimensional input data, leaving a significant gap between the setting studied and the architectures and datasets used in practice. The purpose of this paper is to narrow this gap. First, we extend the previous method, which incorporates scaling laws of signal propagation processes into the eigenvalue analysis of the Neural Tangent Kernel (NTK), to high-dimensional input data, and discuss the relationship among the optimal number of hidden layers, the amount of training data needed for learning, and the input data dimension. Second, for networks with skip connections, such as ResNet, we reveal qualitative changes in the dynamics of signal propagation and differences in the NTK eigenvalue spectra, and discuss the role of skip connections in deep learning.


Presentation information

[4I3-GS-11] AI and Society
