JSAI2025

Presentation information

Organized Session

[3R1-OS-45] OS-45

Thu. May 29, 2025 9:00 AM - 10:40 AM Room R (Room 805)

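Organizers: Michimasa Inaba (University of Electro-Communications), Ryuichiro Higashinaka (Nagoya University), Ryoko Tokuhisa (Aichi Institute of Technology / RIKEN)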

9:20 AM - 9:40 AM

[3R1-OS-45-02] Construction of a Persona Dialogue Dataset Collected from a Self-publishing Novel Website

〇Ryuichi Uehara1, Michimasa Inaba1 (1. University of Electro-Communications)

Keywords: Dialogue system, Dialogue dataset, Role-playing

Research on character role-playing with large language models (LLMs) has been active in recent years. One way to evaluate role-playing ability is to check whether an LLM responds consistently with a given persona, that is, a description of the character. To assess this ability accurately, the LLM should role-play characters that it has not encountered during training. However, many existing role-playing datasets feature characters from well-known works, which may appear frequently in the LLM's pre-training data. As a result, the LLM's ability to use the persona may not be evaluated accurately. To address this, we constructed a persona-based dialogue dataset by collecting dialogue from 608 characters across 96 online novels, including lesser-known works. Experimental results show that fine-tuning is important for improving the persona-based role-playing ability of LLMs. At the same time, we found that generalizing role-playing ability to characters not included in the training data remains a challenge. This suggests that the dataset could be useful for exploring training methods that improve generalization.
