JSAI2021

Presentation information

General Session

General Session » GS-5 Language media processing

[4J1-GS-6d] Language media processing: Natural language processing (1/2)

Fri. Jun 11, 2021 9:00 AM - 10:40 AM Room J (GS room 5)

Chair: 川野 陽慈 (Keio University)

9:20 AM - 9:40 AM

[4J1-GS-6d-02] Knowledge Distillation of Japanese Morphological Analyzer

〇Sora Tagami¹, Daisuke Bekki¹ (1. Ochanomizu University)

Keywords: Natural Language Processing, Morphological Analysis, Deep Learning

In this study, we apply knowledge distillation to the Japanese morphological analyzer rakkyo and evaluate whether the method reduces its model size and whether training converges on smaller datasets. Recently, Japanese morphological analyzers have achieved high performance in both accuracy and speed. From the viewpoint of practical use, however, a smaller model size is preferable. The rakkyo model, among others, succeeded in significantly reducing its model size by using only character unigrams and discarding the dictionary, training on silver data of 500 million sentences generated by Juman++. We tried to compress rakkyo further by constructing a neural morphological analyzer for Japanese that uses the outputs of rakkyo, namely its probability distributions, as training data. The evaluation is done against the silver data generated by rakkyo and suggests that our model approaches the accuracy of rakkyo with a smaller amount of data.
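
The abstract does not state the loss function used. As a rough illustration only, training a student on a teacher's probability distributions is often done by minimizing the cross-entropy between the teacher's distribution and the student's softmax output at each character. The sketch below assumes that setup; the function names, the two-tag label set, and the example numbers are hypothetical and are not taken from rakkyo or from the authors' implementation.

```python
import math

def log_softmax(logits):
    # Numerically stable log-softmax over one character's tag scores.
    m = max(logits)
    log_total = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_total for x in logits]

def soft_cross_entropy(teacher_probs, student_logits):
    # Cross-entropy between the teacher's distribution (soft target)
    # and the student's predicted distribution for one character.
    log_q = log_softmax(student_logits)
    return -sum(p * lq for p, lq in zip(teacher_probs, log_q))

def distillation_loss(teacher_seq, student_seq):
    # Mean soft cross-entropy over all characters in a sentence.
    losses = [soft_cross_entropy(t, s) for t, s in zip(teacher_seq, student_seq)]
    return sum(losses) / len(losses)

# Hypothetical 3-character sentence with a 2-tag label set.
teacher_seq = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]   # teacher probabilities
student_seq = [[2.0, 0.0], [0.0, 1.0], [0.0, 0.0]]   # student logits
loss = distillation_loss(teacher_seq, student_seq)
```

Because the target is a full distribution rather than a single label, each training sentence carries more information than a hard-labeled one, which is one commonly cited reason a distilled model may need less data.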
