Presentation information

[4P3-OS-8] OS-8

Fri. Jun 12, 2020 2:00 PM - 3:40 PM Room P (jsai2020online-16)

Tsuneaki Kato (The University of Tokyo), Katsuhiko Toyama (Nagoya University), Shinsuke Mori (Kyoto University)

2:20 PM - 2:40 PM

[4P3-OS-8-02] Multi-label text classification for risk prediction in contracts

〇Mina Fujii (1), Tomohiko Abe (1), Koji Takahashi (1), Yasuhiro Iwaki (1), Tsuneaki Kato (2) (1. GVA TECH K.K., 2. Graduate School of Arts and Sciences, The University of Tokyo)

Keywords: Machine Learning, Natural Language Processing, LegalTech, Multiclass Classification, Multilabel Classification

Determining valid criteria for detecting risks in contracts is essential for automating legal tasks such as contract review. In this paper, we propose multi-label text classification with a neural network model to predict multiple review points in each clause of a contract. Our dataset consists of over 20k Japanese contracts, in which each clause carries one to four labels drawn from 205 labels in total. On the test data, our model achieved 31% to 64% accuracy, depending on the number of labels an input text carries. In addition, we observed how the output probabilities change, character by character, from the first to the last character of each input text, in order to examine the relation between input tokens and output labels. We found that this observation helps show which parts of the input text the model attends to when predicting labels.
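The character-by-character observation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the neural network, so a hypothetical keyword-based scorer (`toy_label_probs`, with the made-up cue table `TOY_CUES`) stands in for the model, and the 0.5 decision threshold and the English example clause are likewise assumptions. The sketch only shows the general idea of scoring every prefix of an input text and watching where each label's probability rises.

```python
import math
from typing import Callable, Dict, List, Optional

# Hypothetical cue words per label. A stand-in for the paper's neural model,
# which the abstract does not describe in detail.
TOY_CUES: Dict[str, List[str]] = {
    "liability": ["damages", "liable"],
    "termination": ["terminate"],
}


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def toy_label_probs(text: str) -> Dict[str, float]:
    """Return one independent probability per label.

    Multi-label setting: each label gets its own sigmoid, so the
    probabilities need not sum to 1 and several labels can fire at once.
    """
    lowered = text.lower()
    probs: Dict[str, float] = {}
    for label, cues in TOY_CUES.items():
        hits = sum(lowered.count(cue) for cue in cues)
        probs[label] = sigmoid(4.0 * hits - 2.0)
    return probs


def predict_labels(probs: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Keep every label whose probability reaches the (assumed) threshold."""
    return sorted(label for label, p in probs.items() if p >= threshold)


def probability_trace(
    text: str, scorer: Callable[[str], Dict[str, float]]
) -> List[Dict[str, float]]:
    """Score every prefix text[:1], text[:2], ..., text[:len(text)]."""
    return [scorer(text[:end]) for end in range(1, len(text) + 1)]


def first_crossing(
    trace: List[Dict[str, float]], label: str, threshold: float = 0.5
) -> Optional[int]:
    """1-based character position where `label` first reaches the threshold."""
    for position, probs in enumerate(trace, start=1):
        if probs[label] >= threshold:
            return position
    return None


clause = "Either party may terminate; the supplier is liable for damages."
trace = probability_trace(clause, toy_label_probs)
final_labels = predict_labels(trace[-1])
term_pos = first_crossing(trace, "termination")
liab_pos = first_crossing(trace, "liability")
```

Plotting each label's values from `trace` against the character position gives the kind of probability transition the abstract refers to. With a real model, `toy_label_probs` would be replaced by a call that returns the model's per-label probabilities for a prefix; the rest of the sketch stays the same.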
