JSAI2021

Presentation information

General Session

General Session » GS-5 Language media processing

[4J2-GS-6e] Language media processing: Natural language processing (2/2)

Fri. Jun 11, 2021 11:00 AM - 12:40 PM Room J (GS room 5)

Chair: Fusataka Kuniyoshi (National Institute of Advanced Industrial Science and Technology)

11:00 AM - 11:20 AM

[4J2-GS-6e-01] Temporal Expression Classification Based on Data Labelling with Word Alignment in Japanese-English Parallel Corpus

〇Kazutaka Kinugawa¹, Hitoshi Ito¹, Hideya Mino¹, Isao Goto¹, Ichiro Yamada¹ (1. NHK Science and Technology Research Laboratories)

Keywords: Natural Language Processing, Temporal Expression, Machine Translation

Temporal expression recognition is a long-standing problem in natural language processing (NLP). One difficulty of this task is disambiguating specific temporal expressions whose meanings change depending on context. This is an essential issue especially in the Japanese news domain, where such expressions occur frequently and consequently mislead NLP systems. One effective approach to this problem is to build a supervised classification model, but preparing a sufficient amount of labeled training data is very costly. In this paper, we present an automatic data labelling method for one such Japanese-specific temporal term. We leverage word alignment in a Japanese-English parallel corpus and resolve the term's ambiguities using information from both the Japanese and English sides. We efficiently build a dataset and manually inspect it to confirm the efficacy of our technique. We train several baseline models on this dataset and obtain consistent performance.
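The labelling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the target term, so the sketch assumes a hypothetical case, "<number>日", which can denote either a day of the month or a span of days. The function name `label_temporal_terms`, the label names `DATE` and `DURATION`, and the English-side cues are all assumptions, and the word alignment is assumed to be supplied by an external aligner as (Japanese index, English index) pairs.

```python
import re

# Assumed ambiguous pattern (hypothetical): "<number>日" may be a date or a duration.
AMBIGUOUS = re.compile(r"^\d+日$")
# Assumed English-side cues: an ordinal suggests a date, "day(s)" suggests a duration.
ORDINAL = re.compile(r"^\d+(st|nd|rd|th)$")
DURATION_WORDS = {"day", "days"}


def label_temporal_terms(ja_tokens, en_tokens, alignment):
    """Return (ja_index, label) pairs for ambiguous Japanese tokens.

    alignment: list of (ja_index, en_index) pairs from an external word aligner.
    label: "DATE", "DURATION", or None when the English side gives no clear cue.
    """
    results = []
    for j, token in enumerate(ja_tokens):
        if not AMBIGUOUS.match(token):
            continue
        # Collect the English words aligned to this Japanese token.
        aligned = {en_tokens[e].lower() for (a, e) in alignment if a == j}
        is_date = any(ORDINAL.match(word) for word in aligned)
        is_duration = bool(aligned & DURATION_WORDS)
        if is_date and not is_duration:
            label = "DATE"
        elif is_duration and not is_date:
            label = "DURATION"
        else:
            label = None  # missing or conflicting evidence: leave unlabelled
        results.append((j, label))
    return results


# Usage with two toy sentence pairs and hand-written alignments.
ja_date = ["会議", "は", "3日", "に", "開かれた"]
en_date = ["The", "meeting", "was", "held", "on", "the", "3rd"]
print(label_temporal_terms(ja_date, en_date, [(0, 1), (2, 6), (4, 3)]))

ja_span = ["会議", "は", "3日", "続いた"]
en_span = ["The", "meeting", "lasted", "three", "days"]
print(label_temporal_terms(ja_span, en_span, [(0, 1), (2, 3), (2, 4), (3, 2)]))
```

Leaving unclear cases unlabelled rather than guessing keeps noise out of the automatically built training set, at the cost of a smaller dataset.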
