Character Setting Extraction for Enhanced Character-LLM Evaluation

Haruhisa Kimoto; Yuta Hitomi; Daichi Sato; Yugo Atobe; Masahiko Koyama; Tomohiro Katada; Kei Hashimoto; Giusto Sara; Takayuki Moriya

[4Xin2-109] Character Setting Extraction for Enhanced Character-LLM Evaluation

〇Haruhisa Kimoto¹,², Yuta Hitomi¹, Daichi Sato¹,³, Yugo Atobe¹, Masahiko Koyama¹, Tomohiro Katada¹, Kei Hashimoto¹, Giusto Sara¹, Takayuki Moriya¹ (1. Aww Inc., 2. Ibaraki University, 3. The University of Tokyo)

Keywords: Character-LLM, Evaluation Metric, Virtual Human

Since Li et al.'s study, research has progressed on large language models (LLMs) that role-play characters, termed Character-LLMs. Li et al. explored how well 32 characters could be reproduced using two approaches: Retrieval Augmented Generation (RAG) and fine-tuning. Wang et al. proposed a method to quantitatively evaluate the personality traits of characters role-played by LLMs using psychological metrics such as the Big Five and MBTI. Shao et al. used ChatGPT to evaluate role-playing proficiency along five axes. However, Wang et al.'s method assesses only personality trait similarity, while Shao et al.'s evaluation criteria lack clarity and may yield black-box results. In addition, role-playing in Japanese demands precise reproduction of various linguistic elements. This study proposes a method for automatically evaluating chatbots that role-play characters: character settings are extracted from the character's past utterances and then used as the basis for automatic evaluation. In the experiment, 54 character settings were extracted from imma's tweet data, and the automatic evaluation was scored by macro-averaged precision.
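For readers unfamiliar with the metric named above, the following is a minimal sketch of macro-averaged precision: precision is computed separately for each label and the per-label values are averaged without weighting. The label names ("consistent", "inconsistent") and the framing of gold versus predicted judgments are illustrative assumptions, not the authors' evaluation protocol.

```python
from collections import defaultdict


def macro_precision(y_true, y_pred):
    """Unweighted mean of per-label precision.

    Labels are taken from both y_true and y_pred; a label that is never
    predicted contributes a precision of 0.0 (a common convention, assumed here).
    """
    tp = defaultdict(int)  # correct predictions per predicted label
    fp = defaultdict(int)  # incorrect predictions per predicted label
    for gold, pred in zip(y_true, y_pred):
        if gold == pred:
            tp[pred] += 1
        else:
            fp[pred] += 1

    labels = set(y_true) | set(y_pred)
    if not labels:
        return 0.0

    per_label = []
    for label in labels:
        predicted = tp[label] + fp[label]
        per_label.append(tp[label] / predicted if predicted else 0.0)
    return sum(per_label) / len(per_label)


# Hypothetical judgments of whether each reply matches a character setting.
gold = ["consistent", "consistent", "inconsistent", "inconsistent"]
pred = ["consistent", "inconsistent", "inconsistent", "inconsistent"]
score = macro_precision(gold, pred)
```

In this toy example the precision is 1/1 for "consistent" and 2/3 for "inconsistent", so the macro average is 5/6. Because every label counts equally, a rare label influences the score as much as a frequent one.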


Presentation information

[4Xin2] Poster session 2
