JSAI2024

Presentation information

General Session


[2B1-GS-2] Machine learning: Text mining

Wed. May 29, 2024 9:00 AM - 10:40 AM Room B (Concert hall)

Chair: Hiroki Sakaji (Hokkaido University)

9:20 AM - 9:40 AM

[2B1-GS-2-02] Hierarchical Multi-label Classification Model Adapted to Training Data with Different Layers of Correct Labels

〇Kengo Miyajima, Yuto Nunome, Yuta Sakai, Masayuki Goto (Waseda University)

Keywords: Hierarchical Multi-label Classification, Embedding, Box Embedding, BERT, Text Classification

Multi-label classification of document data is the task of correctly assigning multiple class labels to each document. The assigned labels often have a semantic hierarchical structure, and taking this structure into account can improve the accuracy of label prediction. The Multi-label Box Model (MBM) has been proposed as a multi-label classification model that accounts for the semantic hierarchy among labels, and its effectiveness has been demonstrated when the training data carry class labels for every layer of the hierarchy. However, real-world document data posted on user-contributed websites often lack class labels for some layers of the hierarchy, and training MBM on such data reduces the accuracy of label prediction. In this study, we propose a framework that uses Bidirectional Encoder Representations from Transformers (BERT) to fill in the labels of the missing layers and then trains MBM on the completed data. We demonstrate the effectiveness of the proposed method through evaluation experiments that compare the accuracy of the conventional method and the proposed method on data in which the labels of some layers are missing.
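The idea behind box embeddings, on which MBM builds, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes hard, axis-aligned boxes with hand-picked coordinates, whereas a trained model would learn its boxes and typically use a smoothed volume. The label names (`science`, `physics`, `sports`) and the helpers `Box`, `intersect`, and `conditional` are hypothetical. The sketch only shows how containment of one box in another can encode a parent-child relation between labels.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Box:
    """Axis-aligned box given by its lower and upper corners (illustrative only)."""
    lower: Tuple[float, ...]
    upper: Tuple[float, ...]

    def volume(self) -> float:
        # Product of side lengths; an empty box (upper < lower) has volume 0.
        vol = 1.0
        for lo, hi in zip(self.lower, self.upper):
            vol *= max(hi - lo, 0.0)
        return vol


def intersect(a: Box, b: Box) -> Box:
    # The intersection of two boxes is again a box (possibly empty).
    lower = tuple(max(x, y) for x, y in zip(a.lower, b.lower))
    upper = tuple(min(x, y) for x, y in zip(a.upper, b.upper))
    return Box(lower, upper)


def conditional(a: Box, b: Box) -> float:
    """Score P(a | b) as vol(a ∩ b) / vol(b)."""
    vol_b = b.volume()
    return intersect(a, b).volume() / vol_b if vol_b > 0 else 0.0


# Hypothetical labels: "physics" is a child of "science"; "sports" is unrelated.
science = Box((0.0, 0.0), (4.0, 4.0))
physics = Box((1.0, 1.0), (3.0, 3.0))
sports = Box((5.0, 0.0), (7.0, 2.0))

print(conditional(science, physics))  # child box lies inside the parent box
print(conditional(physics, science))  # parent only partly covered by the child
print(conditional(sports, physics))   # disjoint boxes
```

Under these assumptions, a child label whose box sits entirely inside its parent's box implies the parent with score 1, while the reverse direction gives a smaller score. That asymmetry is what makes boxes a natural fit for label hierarchies.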
