JSAI2022

Presentation information

General Session

[1P4-GS-6] Language media processing: basic theory

Tue. Jun 14, 2022 2:20 PM - 4:00 PM Room P (Online P)

Chair: 岡嶋 穣 (NEC) [Remote]

3:40 PM - 4:00 PM

[1P4-GS-6-05] Sequence-to-Sequence Document Revision Models Using Switching Tokens to Handle Multiple Perspectives Simultaneously

〇Mana Ihori1, Hiroshi Sato1, Tomohiro Tanaka1, Ryo Masumura1 (1. NTT)

[Online]

Keywords: document revision, switching token, matched and partially-matched task

This paper defines the document revision task and proposes a novel modeling method for it. The task aims to support writing by considering multiple perspectives simultaneously: it is important not only to correct grammatical errors but also to improve readability and perspicuity. However, it is difficult to prepare a sufficiently large matched dataset that handles multiple perspectives simultaneously. To mitigate this problem, our idea is to utilize not only a limited matched dataset but also various partially-matched datasets that each handle an individual perspective. Since suitable partially-matched datasets have either been published or can easily be made, we expect to be able to prepare a large amount of such data. To utilize these datasets effectively, the proposed method incorporates "on-off" switches into a sequence-to-sequence model to distinguish the matched dataset from the individual partially-matched datasets. Experiments using the document revision dataset demonstrate the effectiveness of the proposed method.
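The "on-off" switch idea can be illustrated with a minimal sketch. It is not the authors' implementation: the perspective names, the token format, and the choice to prefix the tokens to the source text are all assumptions made for illustration, and the abstract does not say where in the sequence-to-sequence model the switches are applied. The sketch only shows how a matched example (all perspectives on) and a partially-matched example (one perspective on) could be told apart by the model.

```python
from typing import Iterable, List

# Hypothetical perspective names, loosely taken from the abstract's wording.
PERSPECTIVES = ("grammar", "readability", "perspicuity")


def switch_tokens(active: Iterable[str]) -> List[str]:
    """Return one on/off token per perspective, always in the same order."""
    active_set = set(active)
    unknown = active_set - set(PERSPECTIVES)
    if unknown:
        raise ValueError(f"unknown perspectives: {sorted(unknown)}")
    # The "[name:on]" / "[name:off]" format is an assumption, not the paper's.
    return [f"[{p}:{'on' if p in active_set else 'off'}]" for p in PERSPECTIVES]


def tag_example(source: str, active: Iterable[str]) -> str:
    """Prefix a source document with the switch tokens for its dataset."""
    return " ".join(switch_tokens(active) + [source])


# A matched example is revised from every perspective at once.
matched = tag_example("teh cat sat", PERSPECTIVES)
# A partially-matched example covers a single perspective.
partial = tag_example("teh cat sat", ["grammar"])
```

Under these assumptions, training data from every dataset shares one model, and at inference time all switches would be set to "on" to request a revision from all perspectives together.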
