JSAI2023

Presentation information

General Session


[3T1-GS-6] Language media processing

Thu. Jun 8, 2023 9:00 AM - 10:40 AM Room T (Online)

Chair: Tomoyuki Kajiwara (Ehime University) [On-site]

9:40 AM - 10:00 AM

[3T1-GS-6-03] Investigation on Accuracy Improvement of Emotion Classification Based on Text Data Augmentation

〇Haruto Uda1, Kazuyuki Matsumoto1, Minoru Yoshida1, Kenji Kita1 (1. Tokushima University)

[Online]

Keywords: Data Augmentation, Natural Language Processing

In recent years, social networking services (SNSs) have made it easy to collect a wide variety of text data. However, SNS text poses two problems: posts are short and full of abbreviations and colloquial expressions, which makes labeling difficult, and a large amount of data is hard to collect in a short period of time. Data augmentation is an effective way to address these problems, because it efficiently prepares large-scale, high-quality labeled text data for machine learning. In this research, we aim to improve the accuracy of emotion classification by applying data augmentation to Japanese text. We used EDA (Easy Data Augmentation) as the augmentation method. In particular, using various models for the text manipulation steps in EDA widened the range of augmentation. The augmented text was evaluated in terms of its semantic similarity and its degree of textual change, and the optimal training data was selected by setting a threshold value. The WRIME corpus was used as the dataset to ensure the reliability of the labels. In this presentation, we report the resulting accuracy of emotion classification with data augmentation.
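
The abstract does not say how similarity, textual change, or the threshold were computed, so the following is only a minimal sketch of the generate-then-filter idea under stated assumptions. It applies two EDA-style operations (random swap and random deletion) to a pre-tokenized sentence, then keeps candidates that stay close to the original yet differ from it. The function names, the Jaccard token overlap used as a stand-in for semantic similarity, the difflib-based change measure, and the threshold values are all hypothetical and are not taken from the paper.

```python
import random
from difflib import SequenceMatcher


def random_swap(tokens, rng):
    """EDA-style random swap: exchange two randomly chosen token positions."""
    if len(tokens) < 2:
        return list(tokens)
    out = list(tokens)
    i, j = rng.sample(range(len(out)), 2)
    out[i], out[j] = out[j], out[i]
    return out


def random_deletion(tokens, rng, p=0.2):
    """EDA-style random deletion: drop each token with probability p."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    # Never return an empty sentence; fall back to one random token.
    return kept if kept else [rng.choice(tokens)]


def token_overlap(a, b):
    """Jaccard overlap of token sets (assumed stand-in for semantic similarity)."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 1.0


def change_degree(a, b):
    """Assumed change measure: 1 minus difflib's sequence match ratio."""
    return 1.0 - SequenceMatcher(None, a, b).ratio()


def augment_and_select(tokens, n_candidates=20, sim_min=0.6,
                       change_min=0.1, seed=0):
    """Generate candidates and keep those that pass both thresholds.

    sim_min and change_min are illustrative values, not the paper's.
    """
    rng = random.Random(seed)
    selected = []
    for _ in range(n_candidates):
        op = rng.choice([random_swap, random_deletion])
        cand = op(tokens, rng)
        similar_enough = token_overlap(tokens, cand) >= sim_min
        changed_enough = change_degree(tokens, cand) >= change_min
        if similar_enough and changed_enough and cand not in selected:
            selected.append(cand)
    return selected


if __name__ == "__main__":
    # Tokens are given directly; real Japanese text would first need a
    # morphological analyzer to split it into words.
    sentence = ["今日", "は", "とても", "楽しい", "一日", "でした"]
    for cand in augment_and_select(sentence):
        print(" ".join(cand))
```

In practice, the token-overlap function would likely be replaced by a sentence-embedding similarity, and the thresholds would be tuned on held-out data. The sketch shows only the selection logic.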
