Presentation information

[4P3-OS-8] OS-8

Fri. Jun 12, 2020 2:00 PM - 3:40 PM Room P (jsai2020online-16)

Tsuneaki Kato (The University of Tokyo), Katsuhiko Toyama (Nagoya University), Shinsuke Mori (Kyoto University)

3:20 PM - 3:40 PM

[4P3-OS-8-05] Japanese Legal Term Correction using BERT Pretrained Model

〇Takahiro Yamakoshi1, Takahiro Komamizu1, Yasuhiro Ogawa1, Katsuhiko Toyama1 (1. Nagoya University)

Keywords:Legal term correction, BERT, Pretrained model

Legal documents contain legal terms that have similar meanings or pronunciations to one another. Japanese legislation defines their usage on the basis of traditional customs and rules, and in accordance with these definitions, such terms must be used properly and strictly in statutes. Following the definitions is also encouraged when writing legal documents in the broad sense, such as contracts and terms of use. To assist in writing legal documents, we propose a method that locates inappropriate legal terms in Japanese statutory sentences and suggests corrections. We address this task by regarding it as a sentence completion test and solving it with a classifier. Our classifier is based on a BERT model pretrained on a large amount of general-domain text. To raise performance, we apply three training techniques: domain adaptation, undersampling, and classifier unification. Our experiments show that our classifier achieves better performance than Random Forest-based and language-model-based classifiers.
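The sentence-completion framing described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the confusable term groups shown (e.g. 又は / 若しくは, both meaning "or") are classic examples from Japanese statutory drafting, and the `score` callable is a hypothetical stand-in for the BERT classifier's probability of each candidate filling the slot.

```python
# Toy sketch of legal term correction as a sentence completion test.
# Each group lists legal terms that are easily confused with one another;
# the pairs below are illustrative, not an exhaustive inventory.
TERM_GROUPS = [
    ("又は", "若しくは"),      # "or" at different nesting levels
    ("その他", "その他の"),    # "and other" vs. "and other such"
]

def candidates_for(term):
    """Return the confusable group containing `term`, or None."""
    for group in TERM_GROUPS:
        if term in group:
            return group
    return None

def correct_term(sentence, start, end, score):
    """Replace the legal term at sentence[start:end] with the best-scoring
    candidate from its confusable group.

    `score(filled_sentence)` is a stand-in for a classifier (e.g. a
    fine-tuned BERT model) that rates how well the candidate fits.
    """
    term = sentence[start:end]
    group = candidates_for(term)
    if group is None:
        return sentence  # not a targeted legal term; leave unchanged
    best = max(group,
               key=lambda c: score(sentence[:start] + c + sentence[end:]))
    return sentence[:start] + best + sentence[end:]
```

A real system would score every candidate with the BERT classifier and flag the original term only when another candidate scores clearly higher; the toy `score` here simply demonstrates the candidate-substitution loop.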
