4:20 PM - 4:40 PM
[3G5-GS-6-03] Consideration of Quality Evaluation Criteria for Interjections in Dialogue Systems and an Attempt to Construct a Quality Evaluation System
Keywords: LLM, dialogue systems, Generative AI, Natural Language Processing, Interjections
In dialogue systems that use large language models (LLMs), generating high-quality interjections is essential for improving responsiveness and empathy. In this study, to support the development of dialogue systems, we constructed a system that quantitatively and automatically evaluates the quality of interjections. To examine quality evaluation criteria, we created dialogue scripts and conducted subjective evaluations of the interjections they contained using a pairwise comparison method. Variations of the interjections were prepared along dimensions such as coherence with context, consistency of tone, and length. We then built a system that scores interjection quality, using ChatGPT as the automatic evaluation environment. The pairwise-comparison subjective evaluations yielded data on the interjection variations and revealed the scales on which humans judge coherence with context, tone consistency, and length. Finally, we assessed the effectiveness of the automatic evaluation system by measuring the correlation between its results and the subjective evaluations.
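The abstract does not specify how pairwise-comparison judgments are turned into per-variant quality scores; a common choice for such data is the Bradley-Terry model. The sketch below (an assumption, not the authors' method) fits Bradley-Terry strengths to a hypothetical win matrix over interjection variants using the standard minorization-maximization update:

```python
# Hedged sketch: converting pairwise-comparison judgments over interjection
# variants into scalar quality scores with the Bradley-Terry model.
# This model choice and the data below are illustrative assumptions;
# the paper does not state its scoring procedure.

def bradley_terry(wins, iters=200):
    """Fit Bradley-Terry strengths; wins[i][j] = times variant i beat j."""
    n = len(wins)
    p = [1.0] * n  # initial strengths
    for _ in range(iters):
        new_p = []
        for i in range(n):
            # Total wins of variant i.
            num = sum(wins[i][j] for j in range(n) if j != i)
            # MM denominator: comparisons with j, weighted by 1/(p_i + p_j).
            den = sum((wins[i][j] + wins[j][i]) / (p[i] + p[j])
                      for j in range(n) if j != i)
            new_p.append(num / den if den else p[i])
        s = sum(new_p)
        p = [v * n / s for v in new_p]  # normalize so scores sum to n
    return p

# Three hypothetical interjection variants; variant 0 is preferred most often.
wins = [[0, 8, 9],
        [2, 0, 6],
        [1, 4, 0]]
scores = bradley_terry(wins)
```

Scores fitted this way provide the human-side scale against which an automatic (e.g., ChatGPT-based) scorer's outputs can be correlated.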