JSAI2022

Presentation information

General Session » GS-2 Machine learning

[3P4-GS-2] Machine learning: NLP

Thu. Jun 16, 2022 3:30 PM - 4:50 PM Room P (Online P)

Chair: Ichiro Kobayashi (Ochanomizu University) [On-site]

4:30 PM - 4:50 PM

[3P4-GS-2-04] Construction of Japanese BERT with Fixed Token Embeddings

〇Arata Suganami1, Hiroyuki Shinnou1 (1. Univ. of Ibaraki)

[Online]

Keywords:Natural Language Processing, Word Embedding, BERT

In this paper, we propose constructing a Japanese BERT with fixed token embeddings in order to reduce BERT's construction time. Specifically, we first learn word embeddings with word2vec, and then fix those word embeddings as BERT's token embeddings. In the experiments, we constructed a 1024-dimensional, 4-layer Japanese BERT with both the conventional method and the proposed method, and verified the effectiveness of the proposed method by comparing model construction time and accuracy on a document classification task over Japanese news articles. The experimental results show that the proposed method reduces construction time by 2.5%, improves accuracy, and converges to that accuracy at an earlier stage of training.
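The core idea above — initializing the token-embedding table from word2vec and excluding it from training — can be sketched in PyTorch as follows. This is a minimal illustration, not the authors' code: the toy vocabulary size, dimensions, and the random stand-in for word2vec vectors are all assumptions.

```python
# Sketch: fixing (freezing) pretrained token embeddings, assuming word2vec
# vectors are already available as a tensor. Toy sizes; the paper uses
# 1024-dimensional embeddings in a 4-layer BERT.
import torch
import torch.nn as nn

vocab_size, dim = 100, 16                    # hypothetical toy sizes
w2v_vectors = torch.randn(vocab_size, dim)   # stand-in for word2vec output

emb = nn.Embedding(vocab_size, dim)
emb.weight.data.copy_(w2v_vectors)           # initialize from word2vec
emb.weight.requires_grad = False             # fix the token embeddings

# Only trainable parameters are handed to the optimizer; the frozen
# embedding table is skipped, so its gradients are never computed or applied.
model = nn.Sequential(emb, nn.Linear(dim, dim))
opt = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3
)

ids = torch.randint(0, vocab_size, (2, 8))   # dummy batch of token ids
loss = model(ids).sum()
loss.backward()
opt.step()

# The embedding table is unchanged after the update step.
assert torch.equal(emb.weight.data, w2v_vectors)
```

Because the embedding table (typically the largest parameter block in a small BERT) receives no gradient updates, each training step does slightly less work, which is consistent with the reported 2.5% reduction in construction time.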
