JSAI2021

Presentation information

Organized Session

[1D2-OS-3a] Data Science for News Media (1/3)

Tue. Jun 8, 2021 1:20 PM - 3:00 PM Room D (OS room 2)

Chair: 園田 亜斗夢 (The University of Tokyo)

2:00 PM - 2:20 PM

[1D2-OS-3a-03] News Articles Summarization with MMR Sentence Selection and TF-IDF Sentence Compression

〇Shotaro Ishihara1, Norihiko Sawa1 (1. Nikkei Inc.)

Keywords: natural language processing, extractive summarization, MMR, TF-IDF

This paper proposes a method for summarizing news articles through sentence selection and sentence compression. The method extracts N sentences that represent the article and enumerates summary candidates by compressing each sentence through syntactic analysis. MMR (Maximal Marginal Relevance) and TF-IDF (Term Frequency - Inverse Document Frequency) are used as the metrics. In experiments, the proposed method extracted the same topics as the human editor's summary in 26% of cases. Although this rate is not high, most of the remaining outputs could not be called incorrect as summary candidates. The method has the potential to reduce the burden on editors and to foster collaboration between editors and the system.
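
The sentence-selection step can be illustrated with a minimal sketch of MMR over TF-IDF vectors. This is not the authors' implementation: the abstract does not give their tokenizer, similarity measure, or parameters, so several things here are assumptions made for illustration. These are the function names (`tfidf_vectors`, `cosine`, `mmr_select`), whitespace tokenization, the smoothed IDF formula, cosine similarity, the use of the whole article as the MMR query, and the trade-off weight `lam`. The compression step based on syntactic analysis is not covered.

```python
import math
from collections import Counter


def tfidf_vectors(sentences):
    """Return one sparse TF-IDF vector (dict: term -> weight) per sentence."""
    # Assumption: whitespace tokenization; other languages need a real tokenizer.
    tokenized = [s.lower().split() for s in sentences]
    n = len(tokenized)
    # Document frequency: number of sentences that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Assumption: smoothed IDF, which keeps every weight positive.
        vectors.append({
            term: (count / len(tokens)) * (math.log((1 + n) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mmr_select(sentences, n=3, lam=0.7):
    """Greedily pick up to n sentences that are relevant but not redundant."""
    vectors = tfidf_vectors(sentences)
    # Assumption: the query is the article itself (sum of sentence vectors).
    query = {}
    for vec in vectors:
        for term, w in vec.items():
            query[term] = query.get(term, 0.0) + w

    selected = []
    candidates = list(range(len(sentences)))

    def score(i):
        relevance = cosine(vectors[i], query)
        # Redundancy: highest similarity to any sentence already selected.
        redundancy = max(
            (cosine(vectors[i], vectors[j]) for j in selected), default=0.0
        )
        return lam * relevance - (1.0 - lam) * redundancy

    while candidates and len(selected) < n:
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return [sentences[i] for i in selected]


if __name__ == "__main__":
    article = [
        "the central bank raised interest rates",
        "the central bank raised interest rates",
        "stock markets fell after the announcement",
    ]
    print(mmr_select(article, n=2, lam=0.7))
```

In this toy input the first two sentences are identical. With `lam=0.7` the redundancy penalty steers the second pick to the third sentence, while `lam=1.0` (pure relevance) would select the duplicate.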
