JSAI2020

Presentation information

General Session

[1D5-GS-9] Natural language processing, information retrieval: Models and meaning acquisition

Tue. Jun 9, 2020 5:20 PM - 7:00 PM Room D (jsai2020online-4)

Chair: 若木裕美 (Sony)

6:00 PM - 6:20 PM

[1D5-GS-9-03] A method of learning word embeddings considering Japanese character types

〇Satoshi Hirade1, Eiichi Tanaka1, Takeshi Onishi1 (1. Fuji Xerox, Co., Ltd.)

Keywords: Word Embeddings, Japanese, Character type

In this paper, we present a novel method for learning word embeddings. Several word embedding approaches that extract subwords from the target word have been proposed; however, these methods leave subwords whose meaning is unrelated to the target word. Such subwords degrade the quality of the learned word embeddings. To solve this problem, we switch subword extraction rules according to Japanese character types, which suppresses the appearance of these subwords. As a result, our method achieved better results on a word similarity task than Word2Vec and FastText.
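
The abstract does not state the extraction rules themselves, so the following Python sketch only illustrates what switching subword extraction by character type could look like. The Unicode block ranges are standard, but the function names (`char_type`, `extract_subwords`) and all three rules are assumptions made for illustration, not the authors' method.

```python
from itertools import groupby


def char_type(ch: str) -> str:
    """Classify one character by its Unicode block."""
    code = ord(ch)
    if 0x3040 <= code <= 0x309F:
        return "hiragana"
    if 0x30A0 <= code <= 0x30FF:
        return "katakana"
    if 0x4E00 <= code <= 0x9FFF:
        return "kanji"  # CJK Unified Ideographs (basic block only)
    return "other"


def extract_subwords(word: str) -> list[str]:
    """Pick a subword extraction rule from the character types in `word`.

    All three rules below are hypothetical examples.
    """
    types = {char_type(ch) for ch in word}
    if types == {"kanji"}:
        # Assumed rule: each kanji carries meaning, so split per character.
        return list(word)
    if types == {"katakana"}:
        # Assumed rule: fragments of a loanword rarely carry meaning,
        # so keep the whole word and emit no smaller subwords.
        return [word]
    # Assumed rule: otherwise split at character-type boundaries.
    return ["".join(group) for _, group in groupby(word, key=char_type)]


if __name__ == "__main__":
    for w in ["自然", "コンピュータ", "食べる"]:
        print(w, extract_subwords(w))
```

By contrast, a character n-gram extractor that ignores character type would split a katakana word such as コンピュータ into fragments like ンピ, the kind of subword with no meaning tied to the target word that the abstract describes.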
