9:20 AM - 9:40 AM
[3M1-GS-10-02] Scandalous Article Classification with Contrastive Learning BERT and Study of Sentence Embedded Representation
Keywords: Deep Learning, Document Analysis, Scandal Article Classification
This study addresses the task of determining whether an economic news article reports a scandal, framed as a binary classification problem. Because scandals can severely affect the management of a company or other entity, such articles must be detected as early as possible, and missing them is costly. The classifier therefore needs a high recall rate. To reduce the number of missed scandal articles, we applied SimCSE, a contrastive learning method that mitigates the anisotropy of BERT's sentence embedding space. In experiments on Reuters articles, BERT with SimCSE achieved a higher recall rate than BERT without SimCSE. The uniformity index of the sentence embedding space also improved, suggesting that the more isotropic space contributed to the improvement in recall. The high level of uniformity was also found to carry over through fine-tuning.
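The two quantities named in the abstract, recall and sentence-space uniformity, can be illustrated with a minimal sketch. The abstract does not give the exact definition of its uniformity index, so the code below assumes the commonly used form: the log of the mean Gaussian potential over pairs of L2-normalized embeddings, with t = 2. The function names and the toy inputs are hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations


def recall(y_true, y_pred):
    """Recall for the positive (scandal) class, labelled 1: TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p != 1)
    return tp / (tp + fn) if (tp + fn) else 0.0


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def uniformity(embeddings, t=2.0):
    """Assumed uniformity index: log of the mean of exp(-t * ||u - v||^2)
    over all distinct pairs of normalized embeddings.
    Lower (more negative) values mean the embeddings are spread more evenly."""
    unit = [l2_normalize(v) for v in embeddings]
    potentials = [
        math.exp(-t * sum((a - b) ** 2 for a, b in zip(u, v)))
        for u, v in combinations(unit, 2)
    ]
    return math.log(sum(potentials) / len(potentials))


# Toy data: embeddings spread around the unit circle vs. packed into a narrow cone.
spread = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
clustered = [[1.0, 0.0], [0.9, 0.1], [1.0, 0.05], [0.95, 0.0]]
```

Under this assumed definition, an anisotropic space (embeddings packed into a narrow cone) scores close to 0, while a more isotropic space scores more negative, so "improvement" in uniformity corresponds to a lower value.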
Translated with www.DeepL.com/Translator (free version)