Study of coherence evaluation for English texts

Takuya Fukumoto; Gou Tanaka

[4Xin2-48] Study of coherence evaluation for English texts

〇Takuya Fukumoto¹, Gou Tanaka¹ (1.NTT DOCOMO, INC.)

Keywords: AI, NLP, Automatic Scoring, Educational application, Coherence

In this paper, we defined a coherence evaluation task, where coherence indicates the naturalness of a text's logical development, and created a rubric for evaluating text quality. Using the rubric, we created a dataset of essays written by English language learners and manually evaluated by experts. The Fleiss' Kappa for the three experts' manual ratings was 0.17. We also conducted automatic coherence evaluation using a specialized model and an LLM. Among the automatic methods, direct evaluation by GPT-4 was the closest to the manual coherence evaluation, with a Pearson's correlation coefficient of 0.381. The original method using Sentence Ordering outperformed the conventional MultiNLI model when a specific score index was used.
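
The two statistics reported in the abstract, Fleiss' Kappa for inter-rater agreement and Pearson's correlation for agreement between automatic and manual scores, can be computed roughly as in the sketch below. This is a minimal illustration using only the Python standard library. The function names and the toy ratings are hypothetical and are not the authors' code or data.

```python
from collections import Counter
from math import sqrt


def fleiss_kappa(ratings):
    """Fleiss' Kappa for a list of items, each a list of labels (one per rater).

    Assumes every item was rated by the same number of raters and that
    more than one category occurs overall (otherwise the denominator is 0).
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    categories = sorted({label for item in ratings for label in item})
    counts = [Counter(item) for item in ratings]

    # Observed agreement: mean over items of the share of agreeing rater pairs.
    per_item = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_item) / n_items

    # Expected agreement by chance, from the overall category proportions.
    proportions = [
        sum(c[cat] for c in counts) / (n_items * n_raters) for cat in categories
    ]
    p_expected = sum(p ** 2 for p in proportions)

    return (p_observed - p_expected) / (1 - p_expected)


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical toy data: 4 essays, 3 raters, coherence scores on a 1-3 scale.
    toy_ratings = [[1, 2, 2], [3, 3, 2], [1, 1, 1], [2, 3, 1]]
    print("Fleiss' Kappa:", fleiss_kappa(toy_ratings))

    # Hypothetical automatic scores compared with the mean manual score per essay.
    manual = [sum(item) / len(item) for item in toy_ratings]
    automatic = [2.0, 2.5, 1.5, 1.5]
    print("Pearson's r:", pearson_r(manual, automatic))
```

Fleiss' Kappa treats the scores as unordered categories, so a near miss (1 versus 2) counts the same as a large disagreement (1 versus 3). Pearson's correlation, in contrast, uses the numeric values of the scores.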

Presentation information

[4Xin2] Poster session 2
