JSAI2018

Presentation information

Oral presentation

[4G2] [General Session] 9. NLP / IR

Fri. Jun 8, 2018 2:00 PM - 3:20 PM Room G (5F Ruby Hall Hiten)

Chair: 竹内 誉羽 (HRI)

3:00 PM - 3:20 PM

[4G2-04] Semi-supervised Sentiment Classification with Dialog Data

〇Toru Shimizu (1), Hayato Kobayashi (1, 2), Nobuyuki Shimizu (1) (1. Yahoo Japan Corporation, 2. RIKEN AIP)

Keywords: Sentiment Analysis, Semi-supervised Learning, Dialog

The high cost of creating labeled training data is a common problem for supervised learning tasks such as sentiment classification. Recent studies have shown that pretraining on unlabeled data with a language model can improve the performance of classification models. In this paper, we take the concept a step further by using a conditional language model instead of a language model. Specifically, we address a sentiment classification task for a tweet analysis service as a case study and propose a pretraining strategy that uses unlabeled dialog data (tweet-reply pairs) with an encoder-decoder model. Experimental results show that our strategy can improve the performance of sentiment classifiers and outperform several state-of-the-art strategies, including language model pretraining.
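
The difference between language model pretraining and conditional language model pretraining can be sketched with the two training objectives below. This is an illustrative formulation in standard notation, not necessarily the authors' exact one: the symbols x (a tweet), y (its reply), and \theta (model parameters) are assumed here and do not appear in the abstract.

```latex
% Language model pretraining: predict each token of a tweet x
% from the tokens that precede it.
\begin{align}
\mathcal{L}_{\mathrm{LM}}(\theta)
  &= -\sum_{t=1}^{|x|} \log p_\theta\left(x_t \mid x_{<t}\right) \\
% Conditional language model pretraining: predict each token of the
% reply y from the preceding reply tokens and the whole tweet x
% (an encoder reads x, a decoder generates y).
\mathcal{L}_{\mathrm{CLM}}(\theta)
  &= -\sum_{t=1}^{|y|} \log p_\theta\left(y_t \mid y_{<t},\, x\right)
\end{align}
```

Under this reading, the tweet-reply pairs supply the (x, y) examples for the second objective without any sentiment labels. Which pretrained parameters are carried over to the sentiment classifier is not stated in the abstract; reusing the encoder that reads x would be one natural choice, but that is an assumption.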