JSAI2023

Presentation information

Organized Session

[4I3-OS-1b] AutoML (Automated Machine Learning)

Fri. Jun 9, 2023 2:00 PM - 3:40 PM Room I (B2)

Organizers: 大西 正輝, 日野 英逸

3:20 PM - 3:40 PM

[4I3-OS-1b-05] Mode-Adaptive Transformer by Automatic Optimization of the Receptive Field

〇Takuya Asakura¹, Nakamasa Inoue¹, Rio Yokota¹, Koichi Shinoda¹ (1. Tokyo Institute of Technology)

Keywords: Transformer, Multi-Layer Perceptron, AutoML

The Vision Transformer (ViT), which uses attention instead of convolution for feature extraction, has demonstrated high performance in image processing. This result shows that the Transformer can handle both time-series data and images, and it is expected to be a versatile model that is independent of the mode of the data. However, many studies derived from ViT have narrowed the receptive field used for feature extraction, which compromises their adaptability to time-series data such as speech. In this paper, we propose a method that adaptively optimizes the receptive field for a given mode of data. We built a model using the proposed method and ran experiments on two types of data, images and speech, and found that it outperforms conventional methods on both. Visualization shows that the proposed method acquires a suitable receptive field depending on the mode of the given data.
