4:10 PM - 4:30 PM
[3P5-OS-17a-03] Systematic design of artificial deep neural networks based on scaling laws in signal propagation
Keywords: deep learning, neural networks, gradient descent, non-equilibrium statistical mechanics
For a more environmentally sustainable development of deep learning (DL) technologies, the computational burden of tuning DL architectures should be reduced. This calls for more systematic strategies for finding an optimal set of hyperparameters that achieves a good balance between training speed and generalization performance. As a preliminary step toward this goal, we address the problem of how to systematically tune fully connected feedforward perceptrons in the so-called "kernel regime". By combining existing theoretical tools, such as the Neural Tangent Kernel (NTK), with an analogy between the signal-propagation dynamics and absorbing phase transitions, we conduct a thorough analysis of the training dynamics of the neural network, including the case of finite depth. As a result, we propose a simple strategy for optimally tuning the initialization hyperparameters and the depth.
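To illustrate the signal-propagation picture the abstract builds on, the following is a minimal sketch (not the authors' code) of the standard mean-field recursion for a random fully connected tanh network, in the style of Poole et al. and Schoenholz et al.: the pre-activation variance q evolves layer by layer, and the initialization is "critical" (the order-to-chaos boundary) where the slope chi of the correlation map equals 1. All parameter names (sigma_w, sigma_b) and the tanh activation are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of mean-field signal propagation in a deep tanh network.
# Assumptions (not from the paper): tanh activation, weights W ~ N(0, sigma_w^2/N),
# biases b ~ N(0, sigma_b^2). Criticality ("edge of chaos") is where chi = 1.

import numpy as np

def gauss_expectation_nodes(n_gauss=201):
    # Gauss-Hermite nodes/weights (probabilists') for expectations under N(0, 1).
    z, w = np.polynomial.hermite_e.hermegauss(n_gauss)
    return z, w / w.sum()

def variance_map(q, sigma_w, sigma_b):
    # One layer of the recursion: q_{l+1} = sigma_w^2 E[tanh(sqrt(q) z)^2] + sigma_b^2.
    z, w = gauss_expectation_nodes()
    return sigma_w**2 * np.sum(w * np.tanh(np.sqrt(q) * z) ** 2) + sigma_b**2

def chi(q, sigma_w):
    # Slope of the correlation map at its fixed point:
    # chi = sigma_w^2 E[tanh'(sqrt(q) z)^2], with tanh'(x) = sech^2(x).
    z, w = gauss_expectation_nodes()
    sech2 = 1.0 / np.cosh(np.sqrt(q) * z) ** 2
    return sigma_w**2 * np.sum(w * sech2**2)

def fixed_point(sigma_w, sigma_b, iters=200):
    # Iterate the variance map to its fixed point q*.
    q = 1.0
    for _ in range(iters):
        q = variance_map(q, sigma_w, sigma_b)
    return q

# Scan sigma_w at fixed sigma_b; the initialization is critical where chi crosses 1,
# which is where depth scales (correlation lengths) diverge.
sigma_b = 0.1
for sigma_w in np.linspace(0.8, 2.0, 13):
    q_star = fixed_point(sigma_w, sigma_b)
    print(f"sigma_w={sigma_w:.2f}  q*={q_star:.4f}  chi={chi(q_star, sigma_w):.4f}")
```

Running the scan prints chi as a function of sigma_w; the value where chi = 1 marks the critical initialization around which depth and initialization hyperparameters would be tuned in this kind of analysis.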