JSAI2024

Presentation information

Poster Session

[3Xin2] Poster session 1

Thu. May 30, 2024 11:00 AM - 12:40 PM Room X (Event hall 1)

[3Xin2-19] Dialogue Text Auto Augmentation using Large Language Model and Policy Exploration

〇Takashi Ushio1, Haruo Fujiwara1, Hiroshi Kato1, Yutaka Yagi2 (1. Hakuhodo DY Holdings, 2. Picolab Co., Ltd.)

Keywords: Spoken Language Understanding, Text Augmentation, Large Language Model, Bayesian Optimization

In recent years, remote conferencing applications have made spoken dialogue data easier to collect, and dialogue analysis has become increasingly popular. One such technique is Spoken Language Understanding (SLU), which classifies utterances by intent; models based on supervised learning have been reported to achieve high accuracy.
Because supervised learning requires data specific to the dialogue domain, it is prone to overfitting in low-resource or class-imbalanced settings, and data augmentation, which enriches the training samples, is widely used as a countermeasure. However, conventional augmentation methods such as synonym substitution often fail to improve performance on short sentences because the transformed text no longer preserves the original meaning.
To overcome these problems, we propose a method that searches over multiple dialogue data augmentation operations based on a large language model (LLM) and over their combinations. Specifically, we treat a combination of operations as an augmentation policy and search for policies with Bayesian optimization to improve model performance. Experimental results on a business dialogue dataset show that the proposed method outperforms both supervised learning and LLM zero-shot learning.
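The idea of an augmentation policy can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the operation names (`paraphrase`, `formalize`, `shorten`) are hypothetical stubs standing in for LLM calls, `toy_score` is a placeholder for training an SLU model on augmented data and measuring validation performance, and plain random search stands in for Bayesian optimization so that the example needs only the standard library.

```python
import random
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins for LLM-backed operations. A real system would
# prompt a language model here; these stubs only mark the text.
def paraphrase(text: str) -> str:
    return text + " (paraphrased)"

def formalize(text: str) -> str:
    return text + " (formal)"

def shorten(text: str) -> str:
    return " ".join(text.split()[:3])

OPERATIONS: Dict[str, Callable[[str], str]] = {
    "paraphrase": paraphrase,
    "formalize": formalize,
    "shorten": shorten,
}

# Assumed policy format: an ordered list of (operation name, probability).
Policy = List[Tuple[str, float]]

def apply_policy(policy: Policy, text: str, rng: random.Random) -> str:
    """Apply each operation in order, each with its own probability."""
    for name, prob in policy:
        if rng.random() < prob:
            text = OPERATIONS[name](text)
    return text

def sample_policy(rng: random.Random, length: int = 2) -> Policy:
    """Draw a random policy of distinct operations."""
    names = rng.sample(sorted(OPERATIONS), k=length)
    return [(name, round(rng.random(), 2)) for name in names]

def toy_score(policy: Policy) -> float:
    """Placeholder objective. In practice this would train the classifier
    on data augmented by the policy and return a validation metric."""
    return sum(prob for name, prob in policy if name == "paraphrase")

def search_policy(score: Callable[[Policy], float],
                  n_trials: int, seed: int = 0) -> Tuple[Policy, float]:
    """Random search, used here in place of Bayesian optimization."""
    rng = random.Random(seed)
    best_policy: Policy = []
    best_score = float("-inf")
    for _ in range(n_trials):
        candidate = sample_policy(rng)
        value = score(candidate)
        if value > best_score:
            best_policy, best_score = candidate, value
    return best_policy, best_score

if __name__ == "__main__":
    policy, value = search_policy(toy_score, n_trials=20)
    print(policy, value)
    print(apply_policy(policy, "please send the report today", random.Random(1)))
```

In a closer approximation of the described method, the random sampler would be replaced by a Bayesian optimizer that proposes the next policy from the scores observed so far, and `toy_score` by an actual training and evaluation run.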
