JSAI2023

Presentation information

General Session


[1T3-GS-6] Language media processing

Tue. Jun 6, 2023 1:00 PM - 2:40 PM Room T (Online)

Chair: Kunihiro Takeoka (NEC) [Online]

2:20 PM - 2:40 PM

[1T3-GS-6-05] Prompt Optimization for Training Generalizable Language Models

〇Masaru Isonuma1,2, Junichiro Mori1,3, Ichiro Sakata1 (1. The University of Tokyo, 2. The University of Edinburgh, 3. RIKEN)

[Online]

Keywords:text generation, meta learning, bilevel optimization

Recently, instruction tuning has been attracting significant attention as a method for training generalizable language models (e.g., ChatGPT).
Although various prompts have been manually created for instruction tuning, it remains unclear what kinds of prompts are optimal for obtaining cross-task generalization ability.
This study presents instruction optimization, which optimizes training prompts by leveraging bilevel optimization, and clarifies what kinds of prompts are optimal for instruction tuning.
Experimental results demonstrate that instruction optimization enhances the diversity of prompts and improves the generalization performance in a zero-shot setting, whereas using the same examples rather than a variety of exemplars is more effective in a few-shot setting.
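The bilevel idea in the abstract can be illustrated with a toy sketch: an outer loop updates a learnable prompt so as to minimize loss on held-out tasks, while an inner loop fits model parameters on the training tasks given that prompt. The following is a minimal, hypothetical PyTorch sketch of unrolled bilevel optimization on a linear toy problem; it is not the authors' implementation, and all variable names, the soft-prompt formulation, and the toy data are assumptions made only for illustration.

import torch

torch.manual_seed(0)

# Toy setup (hypothetical): the "prompt" is a learnable soft-prompt vector
# added to every input; it plays the role of the training prompt being optimized.
dim = 8
prompt = torch.zeros(dim, requires_grad=True)              # outer variable
x_train, y_train = torch.randn(32, dim), torch.randn(32, 1)  # training tasks
x_val, y_val = torch.randn(32, dim), torch.randn(32, 1)      # held-out tasks

outer_opt = torch.optim.Adam([prompt], lr=1e-2)

def inner_loop(prompt, steps=5, lr=1e-1):
    # Inner problem: fit model weights w on the training split given the prompt.
    # create_graph=True keeps the computation graph so the outer (validation)
    # loss can backpropagate through the unrolled inner updates into the prompt.
    w = torch.zeros(dim, 1, requires_grad=True)
    for _ in range(steps):
        loss = (((x_train + prompt) @ w - y_train) ** 2).mean()
        (grad,) = torch.autograd.grad(loss, w, create_graph=True)
        w = w - lr * grad
    return w

for step in range(100):
    w = inner_loop(prompt)
    # Outer problem: generalization loss of the inner solution on held-out data.
    val_loss = (((x_val + prompt) @ w - y_val) ** 2).mean()
    outer_opt.zero_grad()
    val_loss.backward()
    outer_opt.step()

print("validation loss after optimizing the prompt:", round(val_loss.item(), 4))

Differentiating the held-out loss through the unrolled inner updates is one standard way to realize bilevel optimization; the paper's actual objective operates over instruction prompts and language-model training rather than this toy regression.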
