Presentation information

Oral presentation

[2C1] [General Session] 9. NLP / IR

Wed. Jun 6, 2018 9:00 AM - 10:40 AM Room C (4F Orchid)

Chair: Taiki Miyanishi (Advanced Telecommunications Research Institute International)

10:20 AM - 10:40 AM

[2C1-05] A study on document classification focusing on the output side weight on Word2Vec

〇Shuto Uchida (1), Tomohiro Yoshikawa (1), Felix Jimenez (1), Takeshi Furuhashi (1) (1. Nagoya University)

Keywords: document classification, Word2Vec, distributed representation, ensemble learning, syn1neg

Document classification is an important technology in the modern information society. In recent years, distributed representations (DR), which embed the semantic relationships between words into vectors, have attracted attention, and methods that apply DR to document classification have been reported. DR are mainly generated with a tool called Word2Vec. Word2Vec learns with a neural network, and the weights on its input side are used as the DR. However, Word2Vec also learns weights on the output side whose characteristics differ from those of the DR; these weights have received little attention and are not commonly used. In this paper, we propose a document classification method based on ensemble learning that uses both the DR and the output-side weights, and we suggest the usefulness of the proposed method.
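The idea in the abstract can be illustrated with a minimal sketch. The abstract does not state which classifier or ensemble scheme the authors use, so everything below is an assumption made for illustration: the two toy weight tables `W_IN` and `W_OUT` hold made-up numbers, documents are represented by averaged word vectors, each ensemble member is a nearest-centroid classifier, and the members are combined by soft voting (averaging class probabilities). In gensim, the two tables would typically be read from `model.wv.vectors` (input side) and `model.syn1neg` (output side, when negative sampling is used); the abstract names only Word2Vec and the keyword syn1neg, not a specific toolkit.

```python
import math

# Hypothetical toy weight tables: word -> vector. The numbers are made up
# and only stand in for trained input-side (DR) and output-side weights.
W_IN = {"goal": [0.9, 0.1], "match": [0.8, 0.3],
        "vote": [0.1, 0.9], "party": [0.2, 0.8]}
W_OUT = {"goal": [0.7, -0.2], "match": [0.6, 0.0],
         "vote": [-0.1, 0.8], "party": [0.0, 0.6]}


def doc_vector(tokens, table):
    """Average the vectors of the tokens that appear in `table`."""
    vecs = [table[t] for t in tokens if t in table]
    if not vecs:
        raise ValueError("no known tokens in document")
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def fit_centroids(docs, labels, table):
    """Nearest-centroid 'training': one mean document vector per class."""
    grouped = {}
    for tokens, label in zip(docs, labels):
        grouped.setdefault(label, []).append(doc_vector(tokens, table))
    return {label: [sum(col) / len(vecs) for col in zip(*vecs)]
            for label, vecs in grouped.items()}


def class_probs(vec, centroids):
    """Softmax over negative squared distances to each class centroid."""
    scores = {label: -sum((a - b) ** 2 for a, b in zip(vec, c))
              for label, c in centroids.items()}
    top = max(scores.values())
    exps = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


def ensemble_probs(tokens, members):
    """Soft voting: average the class probabilities of all members.

    Each member is a (weight_table, centroids) pair.
    """
    all_probs = [class_probs(doc_vector(tokens, table), cents)
                 for table, cents in members]
    return {label: sum(p[label] for p in all_probs) / len(all_probs)
            for label in all_probs[0]}


def ensemble_predict(tokens, members):
    """Return the class with the highest averaged probability."""
    probs = ensemble_probs(tokens, members)
    return max(probs, key=probs.get)


# Usage: one member built on the DR, one on the output-side weights.
train_docs = [["goal", "match"], ["vote", "party"]]
train_labels = ["sports", "politics"]
members = [(table, fit_centroids(train_docs, train_labels, table))
           for table in (W_IN, W_OUT)]
print(ensemble_predict(["goal"], members))
```

Soft voting is chosen here only because it lets the two weight tables contribute independently; any other combination rule, such as majority voting or stacking, would fit the same outline.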