9:00 AM - 9:20 AM
[3M1-OS-12a-01] Analysis of QA Task and AV Task in NTCIR-17 QA Lab-PoliInfo-4
Keywords: Question Answering, Automatic Summarization, Factual Error
This study analyzes which errors occur in QA and summarization systems that use large language models, and whether such errors can be detected.
The data are the results of the Question Answering (QA) and Answer Verification (AV) tasks of NTCIR-17 QA Lab-PoliInfo-4, which use assembly minutes.
In the QA task, a system takes a summary of a question as input and outputs a summary of the corresponding answer in the assembly minutes; we analyzed the errors in these outputs.
In the AV task, a system judges whether an output of the QA task is correct; we analyzed which kinds of output are misjudged.