4:30 PM - 4:50 PM
[3P4-GS-2-04] Construction of Japanese BERT with Fixed Token Embeddings
[[Online]]
Keywords: Natural Language Processing, Word Embedding, BERT
In this paper, we propose constructing Japanese BERT with fixed token embeddings in order to reduce BERT's construction time. Specifically, we first learn word embeddings with word2vec and then fix them as BERT's token embeddings. In the experiments, we constructed 1024-dimensional, 4-layer Japanese BERT models with both the conventional method and the proposed method, and verified the effectiveness of the proposed method by comparing model construction time and accuracy on a document classification task for Japanese news articles. The experimental results show that the proposed method reduces construction time by 2.5%, improves accuracy, and converges to its final accuracy at an earlier stage of training.
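As a rough illustration of the idea (not the authors' code), the sketch below pretrains word vectors with word2vec, copies them into the token-embedding table of a 4-layer, 1024-dimensional BERT, and freezes that table so it receives no gradient updates. The toy corpus, the assumption that word2vec's vocabulary indices align with BERT's tokenizer, and the label count are all placeholders; the calls follow the gensim and Hugging Face transformers APIs.

```python
# Minimal sketch of fixed token embeddings for BERT, assuming a shared
# vocabulary between word2vec and the BERT tokenizer (a simplification).
import torch
from gensim.models import Word2Vec
from transformers import BertConfig, BertForSequenceClassification

# Hypothetical pretraining corpus: tokenized Japanese sentences.
sentences = [["今日", "は", "晴れ"], ["ニュース", "記事", "の", "分類"]]

# Step 1: learn 1024-dimensional word vectors with word2vec.
w2v = Word2Vec(sentences, vector_size=1024, min_count=1)

# Step 2: build a 4-layer BERT with hidden size 1024, matching the paper's setup.
config = BertConfig(
    vocab_size=len(w2v.wv),
    hidden_size=1024,
    num_hidden_layers=4,
    num_attention_heads=16,
    num_labels=9,  # placeholder: number of news categories in the target task
)
model = BertForSequenceClassification(config)

# Step 3: copy the word2vec vectors into the token-embedding table and freeze it,
# so only the transformer layers (and other embeddings) are trained.
emb = model.bert.embeddings.word_embeddings
with torch.no_grad():
    emb.weight.copy_(torch.tensor(w2v.wv.vectors))
emb.weight.requires_grad = False  # fixed token embeddings

# Only the remaining parameters receive gradient updates during training.
trainable = [p for p in model.parameters() if p.requires_grad]
```

Freezing the embedding table removes its (large) gradient and optimizer-state updates from each training step, which is the source of the reported construction-time saving.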