JSAI2024

Presentation information

Organized Session


[2M5-OS-24] OS-24

Wed. May 29, 2024 3:30 PM - 5:10 PM Room M (Room 53)

Organizers: Masaki Onishi (AIST), Hideitsu Hino (ISM / RIKEN AIP)

4:50 PM - 5:10 PM

[2M5-OS-24-05] Model compression of BERT with One-Shot NAS

〇Takumi Okamoto1, Rio Yokota1 (1. Tokyo Institute of Technology)

Keywords:NAS, BERT, Local feature, Model compression

In recent years, language models have been scaled to ever larger sizes to improve performance, but pre-training such models requires a large amount of time. Model compression has therefore been studied as a way to reduce model size while maintaining performance. Separately, it has been shown that incorporating architectures that can efficiently learn local features improves the performance of language models. In this study, to search for model structures that reduce model size while maintaining performance, we conducted a neural architecture search (NAS) over architectures that can efficiently learn local features.
We evaluated the resulting models on the GLUE benchmark. Compared to the BERT-base model, we reduced the number of model parameters by 46.1% while increasing the average score by 0.5 points.
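This page does not describe the method itself, so the sketch below is only a rough illustration of the one-shot (weight-sharing) NAS idea named in the title: a supernet in which every layer holds all candidate operations, random subnetworks are sampled during a single training run, and candidate architectures are then ranked using the shared weights. The two-operation search space (self-attention versus a depthwise convolution as a cheap local-feature operator), all class names, and the scoring loop are illustrative assumptions, not the authors' implementation.

# Minimal one-shot (weight-sharing) NAS sketch; all names and the
# search space are illustrative assumptions, not the paper's method.
import random
import torch
import torch.nn as nn

DIM, LAYERS = 128, 4

class AttentionBlock(nn.Module):
    """Self-attention block: models global interactions."""
    def __init__(self, dim):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.norm = nn.LayerNorm(dim)
    def forward(self, x):
        out, _ = self.attn(x, x, x)
        return self.norm(x + out)

class LocalConvBlock(nn.Module):
    """Depthwise convolution: a cheap operator for local features."""
    def __init__(self, dim):
        super().__init__()
        self.conv = nn.Conv1d(dim, dim, kernel_size=3, padding=1, groups=dim)
        self.norm = nn.LayerNorm(dim)
    def forward(self, x):
        out = self.conv(x.transpose(1, 2)).transpose(1, 2)
        return self.norm(x + out)

class SuperNet(nn.Module):
    """Weight-sharing supernet: every candidate op exists at every layer."""
    def __init__(self):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.ModuleDict({"attn": AttentionBlock(DIM), "conv": LocalConvBlock(DIM)})
            for _ in range(LAYERS)
        )
    def forward(self, x, arch):
        # arch is a list of op names, one per layer, selecting the path.
        for layer, op in zip(self.layers, arch):
            x = layer[op](x)
        return x

net = SuperNet()
x = torch.randn(2, 16, DIM)  # (batch, seq_len, dim) toy input

# One-shot training: sample a random subnetwork each step; all subnets
# share the supernet's weights, so only one training run is needed.
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(3):
    arch = [random.choice(["attn", "conv"]) for _ in range(LAYERS)]
    loss = net(x, arch).pow(2).mean()  # dummy objective
    opt.zero_grad(); loss.backward(); opt.step()

# Search: rank candidate subnets with the shared weights, no retraining.
candidates = [["attn"] * LAYERS, ["conv"] * LAYERS, ["attn", "conv"] * (LAYERS // 2)]
for arch in candidates:
    with torch.no_grad():
        score = net(x, arch).mean().item()  # stand-in for a validation metric
    print(arch, round(score, 4))

In practice, the selected subnetwork would typically be fine-tuned on the downstream tasks (e.g., the GLUE suite) before its score and parameter count are compared against the BERT-base baseline.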
