JSAI2020

Presentation information

General Session

[1D3-GS-13] AI application: Text mining and natural language

Tue. Jun 9, 2020 1:20 PM - 3:00 PM Room D (jsai2020online-4)

Chair: Eiichi Umehara (Tokyo City University)

2:40 PM - 3:00 PM

[1D3-GS-13-05] Creation of a Japanese SDGs dataset and a baseline model of classification

〇XIN ZHANG1, YUSUKE MOTOKI2, YUYA SONEOKA1, YUSUKE IWASAWA1, YUTAKA MATSUO1 (1. Graduate School of Engineering, The University of Tokyo; 2. Graduate School of Engineering, Keio University)

Keywords: SDGs, NLP, Deep Learning, AI, Classification

Natural language processing tasks targeting the SDGs (Sustainable Development Goals), which have begun to influence social structures and corporate philosophy, have recently started to appear. For Japanese, however, such efforts have been difficult because of a lack of language resources. In this study, we collected Japanese SDGs-related data from materials published by universities, created a dataset, and constructed an SDGs classification model. Two data augmentation methods were used: (1) part-of-speech replacement using the BERT masked language model, and (2) back-translation, in which text translated into English with Google Translate is translated back into Japanese. Classification was performed using a topic model (LDA etc.), a classical machine learning method, and BERT etc., a deep learning model. The results show the effect of the augmentation on the task with a small amount of data, where relatively high accuracy is obtained.
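
The two augmentation methods named in the abstract can be sketched roughly as follows. This is a minimal illustration only, not the authors' implementation: the abstract does not say which parts of speech were replaced or how predictions were chosen, so `target_pos`, `fill_mask`, `ja_to_en`, and `en_to_ja` are hypothetical parameters. The callables stand in for a BERT masked language model and a translation service, which would have to be supplied separately.

```python
from typing import Callable, List, Sequence, Set

MASK = "[MASK]"

# Hypothetical interface: given a token list containing one [MASK] and the
# index of that mask, return a replacement token (e.g. from a BERT masked LM).
FillMask = Callable[[List[str], int], str]


def mask_replace(
    tokens: Sequence[str],
    pos_tags: Sequence[str],
    target_pos: Set[str],
    fill_mask: FillMask,
) -> List[str]:
    """Replace every token whose POS tag is in target_pos with a prediction.

    Each position is masked in the ORIGINAL sentence, so replacements do not
    influence one another (an assumption; the source does not specify this).
    """
    if len(tokens) != len(pos_tags):
        raise ValueError("tokens and pos_tags must have the same length")
    original = list(tokens)
    augmented = list(tokens)
    for i, pos in enumerate(pos_tags):
        if pos in target_pos:
            masked = original[:i] + [MASK] + original[i + 1:]
            augmented[i] = fill_mask(masked, i)
    return augmented


def back_translate(
    text: str,
    ja_to_en: Callable[[str], str],
    en_to_ja: Callable[[str], str],
) -> str:
    """Translate Japanese to English and back to obtain a paraphrase."""
    return en_to_ja(ja_to_en(text))


if __name__ == "__main__":
    # Toy stand-ins so the sketch runs without any model or network access.
    def toy_fill(masked: List[str], index: int) -> str:
        return "教育"

    tokens = ["大学", "が", "貧困", "を", "減らす"]
    tags = ["NOUN", "PART", "NOUN", "PART", "VERB"]
    print(mask_replace(tokens, tags, {"NOUN"}, toy_fill))
    print(back_translate("貧困をなくそう", lambda s: s, lambda s: s))
```

Passing the model and the translators in as callables keeps the sketch independent of any particular library; in practice each augmented sentence would keep the SDGs label of the sentence it was derived from.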
