17:00 〜 17:20
[1U5-IS-2b-01] Comparing Feature Extraction Methods for Sarcasm Detection in Twitter
[[Online, Regular]]
Keywords: sarcasm detection, feature extraction, machine learning
Sarcasm detection is a challenging task that identifies expressions whose intended meaning is the opposite of what is written. Most previous work measures only the sentiment polarity of sentences; however, more features are needed to improve results. In this paper, we compare different feature extraction methods for sarcasm detection, including n-gram, sentiment, punctuation, and part-of-speech features. First, sarcastic data are collected using the Twitter API and preprocessed by removing all hashtags, mentions, and URLs. Then, after all features are extracted, they are combined using one-hot encoding. Finally, we use two classification methods, Support Vector Machine and Logistic Regression, for comparison. In our experimental results, the n-gram feature gives the best performance among the individual features. Support Vector Machine performs better than Logistic Regression, with an F1-measure of 79.64%. This shows the potential of combining different features for sarcasm detection.
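The preprocessing and two of the feature types named in the abstract can be illustrated with a minimal sketch. The abstract does not describe the authors' implementation, so everything here is an assumption made for illustration: the regular expressions, the function names `preprocess`, `punctuation_features`, and `word_ngrams`, and the particular punctuation counts chosen.

```python
import re
from collections import Counter

# Assumed patterns; the abstract only says hashtags, mentions and URLs are removed.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")


def preprocess(tweet: str) -> str:
    """Strip URLs, mentions and hashtags, then collapse whitespace."""
    text = URL_RE.sub(" ", tweet)
    text = MENTION_RE.sub(" ", text)
    text = HASHTAG_RE.sub(" ", text)
    return " ".join(text.split())


def punctuation_features(text: str) -> dict:
    """Count a few punctuation marks (a hypothetical feature set)."""
    return {
        "exclamations": text.count("!"),
        "questions": text.count("?"),
        "ellipses": text.count("..."),
    }


def word_ngrams(text: str, n: int) -> Counter:
    """Count word n-grams over lowercased alphabetic tokens."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(
        " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    )


if __name__ == "__main__":
    raw = "Oh great, another Monday!!! @bob #sarcasm https://t.co/abc"
    clean = preprocess(raw)
    print(clean)
    print(punctuation_features(clean))
    print(word_ngrams(clean, 2))
```

The resulting dictionaries and counters would still need to be turned into a fixed-length vector before being passed to a classifier such as a Support Vector Machine or Logistic Regression; that step is omitted here.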