JSAI2021

Presentation information

Organized Session


[1D4-OS-3c] Data Science for News Media (3/3)

Tue. Jun 8, 2021 5:20 PM - 6:20 PM Room D (OS room 2)

Chair: Masanori Takano (CyberAgent)

5:40 PM - 6:00 PM

[1D4-OS-3c-02] A Study of Abstract Summarization Method for Japanese News Articles Using BertSum

〇Keito Ishihara1, Shotaro Ishihara2, Hono Shirai2 (1. University of Tsukuba, 2. Nikkei)

Keywords: NLP, Abstractive summarization, BERT

In this study, we tackle abstractive summarization of Japanese news articles using BERT, which has become common in natural language processing in recent years. Specifically, we use BertSum, a summarization method that extends BERT. We trained BertSum with three types of BERT, and the experiments showed that the Japanese pre-trained models performed better than the multilingual model. There was no significant difference in performance between the model pre-trained on Japanese news articles and the model pre-trained on Japanese Wikipedia. We also discuss tokenizers and unknown words, which are important when dealing with news articles in Japanese.
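
To illustrate why tokenizers and unknown words matter for Japanese news text, the following is a minimal sketch of greedy longest-match-first WordPiece splitting, the subword scheme commonly used with BERT. It is not the authors' implementation: the vocabulary `TOY_VOCAB`, the helper names `wordpiece_tokenize` and `unknown_rate`, and the example words are all hypothetical, and the input is assumed to be already segmented into words (for example by a morphological analyzer).

```python
def wordpiece_tokenize(word, vocab, unk_token="[UNK]"):
    """Split one pre-segmented word into subwords, greedy longest match first."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink it.
        while start < end:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece  # continuation pieces carry a "##" prefix
            if piece in vocab:
                match = piece
                break
            end -= 1
        if match is None:
            # No piece covers this position: the whole word becomes unknown.
            return [unk_token]
        pieces.append(match)
        start = end
    return pieces


def unknown_rate(words, vocab, unk_token="[UNK]"):
    """Fraction of output tokens that are the unknown token."""
    tokens = [t for w in words for t in wordpiece_tokenize(w, vocab, unk_token)]
    return tokens.count(unk_token) / len(tokens) if tokens else 0.0


# Hypothetical toy vocabulary and word list, for illustration only.
TOY_VOCAB = {"日本", "経済", "##新聞", "株", "##価", "東京"}
words = ["日本", "経済新聞", "株価", "日経平均"]

for w in words:
    print(w, wordpiece_tokenize(w, TOY_VOCAB))
print("unknown rate:", unknown_rate(words, TOY_VOCAB))
```

In this toy setting, a word whose characters are missing from the vocabulary collapses to a single `[UNK]` token, so its content is lost to the summarizer. A vocabulary built from a different corpus would change which words are affected.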
