JSAI2023

Presentation information

General Session

[3A1-GS-6] Language media processing

Thu. Jun 8, 2023 9:00 AM - 10:40 AM Room A (Main hall)

Chair: Yuta Koreeda (Hitachi, Ltd.) [On-site]

10:00 AM - 10:20 AM

[3A1-GS-6-04] Countermeasures against inappropriate labels using Active Learning in Recognizing Textual Entailment

〇Ai Matsuho1, Hitoshi Iyatomi1 (1. Hosei University)

Keywords: RTE, NLP, Active Learning

Recognizing textual entailment (RTE) is an important technology, but it remains a research challenge because data sets contain a large number of inappropriate training labels. In this report, we propose Active Clean, which uses active learning (AL) to detect inappropriate labels. The method improves performance by manually assigning correct labels to a small amount of selected data and then repeating the re-training process. A sampling survey of the JSNLI dataset used in this study showed that about 10% of the labels were incorrect. When these mislabeled data were examined using Active Clean, the majority of them were estimated to be inappropriate. For test data with confirmed correct labels, the RTE model built by excluding these data from the training data improved the average prediction performance by 7.8% compared to the regularly trained model. This indicates that Active Clean is effective in identifying many inappropriate labels and has the potential to build more robust models.
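
The abstract does not give implementation details, so the following is only a minimal sketch of the general loop it describes: train a model, select the samples whose given label the model disagrees with most, have a human assign the correct label, and retrain. It is not the authors' Active Clean implementation. As assumptions for illustration, it uses a toy nearest-centroid classifier on 2-D points instead of an RTE model, and every name in it (`train_centroids`, `label_margin`, `clean_labels`, `oracle`, `budget`) is hypothetical.

```python
def train_centroids(X, y):
    """Fit a nearest-centroid classifier: one mean vector per label."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}


def label_margin(centroids, x, label):
    """Disagreement score: squared distance to the given label's centroid
    minus squared distance to the nearest other centroid.
    A large positive value means the model prefers a different label."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    own = sq_dist(centroids[label])
    other = min(sq_dist(c) for lab, c in centroids.items() if lab != label)
    return own - other


def clean_labels(X, y, oracle, rounds=3, budget=5):
    """Repeat: train, pick the `budget` most suspicious unchecked samples,
    ask the (hypothetical) human `oracle` for their labels, then retrain."""
    y = list(y)
    checked = set()
    for _ in range(rounds):
        centroids = train_centroids(X, y)
        candidates = [i for i in range(len(X)) if i not in checked]
        candidates.sort(
            key=lambda i: label_margin(centroids, X[i], y[i]), reverse=True
        )
        for i in candidates[:budget]:
            y[i] = oracle(i)  # manual relabeling step
            checked.add(i)
    return y, checked


# Toy data (an assumption, not JSNLI): two well-separated clusters.
X = [(0.1 * i, 0.1 * (i % 3)) for i in range(10)] + \
    [(5 + 0.1 * i, 5 + 0.1 * (i % 3)) for i in range(10)]
true_labels = [0] * 10 + [1] * 10

# Flip 2 of 20 labels to mimic roughly 10% label noise.
noisy_labels = list(true_labels)
noisy_labels[2] = 1
noisy_labels[13] = 0

cleaned, checked = clean_labels(
    X, noisy_labels, oracle=lambda i: true_labels[i], rounds=1, budget=2
)
```

In this toy run the two flipped samples (indices 2 and 13) have the largest disagreement scores, so a budget of two manual checks restores the true labels. The abstract's final evaluation excludes the samples estimated to be inappropriate from training; in this sketch that would correspond to dropping the flagged indices instead of relabeling them.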
