Analysis of Applying LLM to the Summary Evaluation Metrics Based on Asking and Answering Questions

Masaki Yamagata; Taku Kato; Hiroshi Fujimoto; Takeshi Yoshimura

[1Win4-37] Analysis of Applying LLM to the Summary Evaluation Metrics Based on Asking and Answering Questions

〇Masaki Yamagata¹, Taku Kato¹, Hiroshi Fujimoto¹, Takeshi Yoshimura¹ (1. NTT DOCOMO, INC.)

Keywords: Summary Evaluation, Large Language Model

In recent years, research on machine learning-based text summarization has been active, and methods for evaluating summaries have also been studied. However, because evaluating a summary requires taking context and meaning into account, no evaluation method that correlates highly with human evaluation has yet been established. The task becomes even harder when evaluation must be done without reference summaries. In this study, we applied an LLM to the question generation model and the question-answering model in QAGS (A. Wang et al., 2020), a reference-free evaluation metric for document summarization, adapted the metric to Japanese, and analyzed its correlation with human evaluation. Our experiments confirmed that using an LLM for the QAGS question generation and question-answering models enables question generation and question answering in Japanese that take context and meaning into account, and yields summary evaluation that correlates with human evaluation.
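
The question-asking and question-answering evaluation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `char_f1` and `qags_style_score` are hypothetical, the two callables stand in for whatever LLM calls generate questions and answers, and character-level F1 is an assumption made here because Japanese text is not whitespace-delimited.

```python
from collections import Counter
from typing import Callable, List


def char_f1(a: str, b: str) -> float:
    """Character-level F1 overlap between two answer strings (an assumed similarity measure)."""
    a_chars, b_chars = list(a.strip()), list(b.strip())
    if not a_chars or not b_chars:
        # Two empty answers count as a match; one empty answer counts as a mismatch.
        return float(a_chars == b_chars)
    overlap = sum((Counter(a_chars) & Counter(b_chars)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(a_chars)
    recall = overlap / len(b_chars)
    return 2 * precision * recall / (precision + recall)


def qags_style_score(
    source: str,
    summary: str,
    generate_questions: Callable[[str], List[str]],  # hypothetical LLM wrapper
    answer: Callable[[str, str], str],               # hypothetical LLM wrapper: (question, context) -> answer
) -> float:
    """Average agreement between answers grounded in the source and in the summary."""
    questions = generate_questions(summary)  # questions are asked about the summary
    if not questions:
        return 0.0
    scores = [
        char_f1(answer(q, source), answer(q, summary))  # same question, two contexts
        for q in questions
    ]
    return sum(scores) / len(scores)
```

A higher score means the summary and the source lead to the same answers, so no reference summary is needed. In practice the two callables would wrap LLM prompts, and the answer-similarity measure could be swapped for another one.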


Presentation information

[1Win4] Poster session 1
