JSAI2021

Presentation information

General Session: GS-5 Language media processing

[4J3-GS-6f] Language media processing: Datasets and their use

Fri. Jun 11, 2021 1:40 PM - 3:20 PM Room J (GS room 5)

Chair: 亀甲 博貴 (Hirotaka Kameko), Kyoto University

2:20 PM - 2:40 PM

[4J3-GS-6f-03] Developing and Evaluating a Context Dataset for How-to Tip Machine Reading Comprehension

〇Shuting Bai¹, Tingxuan Li¹, Seiji Suzuki¹, Takehito Utsuro¹, Yasuhide Kawada² (1. University of Tsukuba, 2. Logworks Co., Ltd.)

Keywords: question answering, machine comprehension, tip, BERT, context

In this paper, we focus on how-to tip machine reading comprehension
(MRC), a task within non-factoid MRC. For this task, we propose a method
for building a context dataset: we apply a retrieval procedure that
collects candidate context paragraphs expected to contain candidate
answers to a given question. The context dataset is drawn from column
pages collected from how-to tip Web sites. We show that it is easy to
develop a context dataset consisting of more than a few thousand context
paragraphs. We then propose a procedure that combines a TF-IDF-based
search module with a BERT machine reading comprehension model, and we
evaluate it on the context dataset developed in this paper.
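
The abstract does not give implementation details, so the following is only a minimal sketch of how a TF-IDF search module might be combined with a reading comprehension model. It assumes a naive whitespace tokenizer, a smoothed IDF weighting, and cosine similarity for ranking. The names tokenize, build_index, retrieve, and answer are hypothetical, and the reader argument is a stand-in callable where a BERT MRC model could be plugged in; none of these come from the paper.

```python
import math
from collections import Counter

PUNCT = ".,?!;:()\"'"

def tokenize(text):
    # Naive lowercase whitespace tokenizer (an assumption for illustration).
    tokens = (t.strip(PUNCT).lower() for t in text.split())
    return [t for t in tokens if t]

def build_index(paragraphs):
    # Term counts per paragraph, then smoothed IDF and TF-IDF vectors.
    docs = [Counter(tokenize(p)) for p in paragraphs]
    n = len(docs)
    df = Counter(term for d in docs for term in d)
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vecs = [{t: tf * idf[t] for t, tf in d.items()} for d in docs]
    return idf, vecs

def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question, paragraphs, idf, vecs, k=2):
    # Rank context paragraphs by similarity to the question; keep the top k.
    # Question terms unseen in the dataset get weight 0.
    counts = Counter(tokenize(question))
    q = {t: tf * idf.get(t, 0.0) for t, tf in counts.items()}
    order = sorted(range(len(paragraphs)),
                   key=lambda i: cosine(q, vecs[i]), reverse=True)
    return [paragraphs[i] for i in order[:k]]

def answer(question, paragraphs, reader, k=2):
    # reader(question, context) -> (answer_text, score) is a hypothetical
    # interface; a BERT MRC model would be wrapped to match it.
    idf, vecs = build_index(paragraphs)
    candidates = retrieve(question, paragraphs, idf, vecs, k)
    return max((reader(question, c) for c in candidates), key=lambda r: r[1])
```

In this sketch the search module narrows the context dataset to k candidate paragraphs, and the reader is applied only to those candidates, with the highest-scoring answer returned. Whether the paper selects answers this way is not stated in the abstract.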
