2:50 PM - 3:10 PM
[2U4-IS-2c-05] Rapid training of Perceiver in a low-cost computing environment
[Online, Regular]
Keywords: Deep Learning, Image Recognition, Foundation Models, Attention Mechanism
Perceiver is a deep learning model that can be applied to a variety of modalities: it processes diverse forms of input and output, such as images, speech, and natural language, with a single architecture. However, Perceiver is computationally more expensive than other models, which makes it difficult to train in environments with limited fast parallel computing resources. In this study, we aimed to reduce the computational cost so that training can be completed in a short time outside large-scale computing systems. To this end, we first show that a speed-up method proposed for the Transformer is also effective for Perceiver. In particular, the gated attention unit proposed for FLASH reduces computational complexity without sacrificing accuracy, and the accelerated model achieves accuracy comparable to that of the original in a limited computing environment. As a representative example, we conducted experiments on the ImageNet image recognition task and demonstrated that the proposed method reduces training time compared with conventional methods without a significant loss of accuracy. The resulting model can handle input and output of any kind of data quickly in a low-cost computing environment.
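For illustration, the sketch below shows one common formulation of the gated attention unit from FLASH in PyTorch. It is not the implementation used in this paper; the expansion factor, the shared head size `s`, the SiLU activations, and the length-normalized relu-squared attention kernel are assumptions drawn from the FLASH paper rather than from this abstract.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedAttentionUnit(nn.Module):
    """Minimal sketch of a gated attention unit (GAU) in the spirit of
    FLASH (Hua et al., 2022). Hyperparameters are illustrative."""

    def __init__(self, dim: int, expansion: int = 2, s: int = 128):
        super().__init__()
        e = dim * expansion
        self.to_uv = nn.Linear(dim, 2 * e)  # gate U and value V, fused
        self.to_z = nn.Linear(dim, s)       # shared low-dim base for Q and K
        # cheap per-dimension scale/offset turning Z into queries and keys
        self.gamma = nn.Parameter(torch.ones(2, s))
        self.beta = nn.Parameter(torch.zeros(2, s))
        self.out = nn.Linear(e, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n = x.shape[-2]                     # sequence length
        u, v = F.silu(self.to_uv(x)).chunk(2, dim=-1)
        z = F.silu(self.to_z(x))
        q = z * self.gamma[0] + self.beta[0]
        k = z * self.gamma[1] + self.beta[1]
        # relu^2 kernel replaces softmax; normalizing by n keeps scores bounded
        attn = F.relu(q @ k.transpose(-1, -2) / n) ** 2
        return self.out(u * (attn @ v))     # gating fuses attention with a GLU

# usage: GatedAttentionUnit(512)(torch.randn(8, 256, 512)).shape -> (8, 256, 512)
```

In a Perceiver-style model, a block of this form would stand in for the combined self-attention and feed-forward sublayers, which is where the reduction in computational cost comes from.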