12:40 PM - 1:00 PM
[4E2-OS-7a-02] Extracting Important Sentences with Random Forest for Statute Summarization
Keywords: Outlines of Japanese Statutes, Automatic Summarization, Random Forest
Our goal is to automatically summarize Japanese acts, and we propose a sentence extraction method based on Random Forest.
Traditional automatic summarization methods rely on information in the source text being summarized, whereas recent machine-learning-based methods use summarization results.
However, the training corpora available for such methods are small, especially for Japanese text.
In this research, we solve this problem by using "Outlines of Japanese Statutes," which are official summaries of statutes published by the Japanese government.
Furthermore, we show that sentence extraction with Random Forest outperforms extraction with decision trees or with support vector machines.
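As a rough illustration of sentence extraction with a Random Forest, the following minimal sketch trains a small forest (bootstrap samples plus a random feature subset at each split) and keeps the sentences that the majority of trees vote "important". It is not the authors' implementation. The three features (relative position, length in characters, presence of a cue term), the toy training values, and all function names are hypothetical and chosen only for illustration. In practice the labels might be derived by matching statute sentences against the corresponding "Outlines of Japanese Statutes", but the abstract does not specify how.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(rows, labels, rng, depth=0, max_depth=3):
    """Grow one tree; a leaf is an int label, an inner node is a tuple."""
    if depth >= max_depth or len(set(labels)) == 1:
        return Counter(labels).most_common(1)[0][0]
    n_features = len(rows[0])
    # Random Forest step: look at a random subset of features per split.
    k = max(1, int(n_features ** 0.5))
    best = None
    for f in rng.sample(range(n_features), k):
        for t in sorted(set(r[f] for r in rows)):
            left = [i for i, r in enumerate(rows) if r[f] <= t]
            right = [i for i, r in enumerate(rows) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini([labels[i] for i in left])
                     + len(right) * gini([labels[i] for i in right]))
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:  # no usable split on the sampled features
        return Counter(labels).most_common(1)[0][0]
    _, f, t, left, right = best
    return (f, t,
            build_tree([rows[i] for i in left], [labels[i] for i in left],
                       rng, depth + 1, max_depth),
            build_tree([rows[i] for i in right], [labels[i] for i in right],
                       rng, depth + 1, max_depth))

def predict_tree(node, row):
    """Walk from the root to a leaf and return its label."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def train_forest(rows, labels, n_trees=25, seed=0):
    """Train each tree on a bootstrap sample of the training data."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], rng))
    return forest

def predict_forest(forest, row):
    """Majority vote over all trees (1 = important, 0 = not important)."""
    votes = Counter(predict_tree(tree, row) for tree in forest)
    return votes.most_common(1)[0][0]

# Toy, made-up feature vectors (hypothetical):
# [relative position in the statute, length in characters, has cue term]
TRAIN_X = [[0.05, 110, 1], [0.10, 95, 1], [0.15, 120, 1], [0.20, 80, 1],
           [0.60, 30, 0], [0.70, 25, 0], [0.80, 40, 0], [0.90, 20, 0]]
TRAIN_Y = [1, 1, 1, 1, 0, 0, 0, 0]

forest = train_forest(TRAIN_X, TRAIN_Y)
candidates = [("sentence A (placeholder)", [0.01, 150, 1]),
              ("sentence B (placeholder)", [0.99, 10, 0])]
summary = [text for text, feats in candidates
           if predict_forest(forest, feats) == 1]
print(summary)
```

A single decision tree corresponds to `n_trees=1` without bootstrapping or feature sampling. The forest differs by averaging many such randomized trees through a vote.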