JSAI2022

Presentation information

General Session


[3M4-GS-4] Web intelligence: model

Thu. Jun 16, 2022 3:30 PM - 4:50 PM Room M (Room B-2)

Chair: 細川 晃 (Toshiba) [Remote]

4:30 PM - 4:50 PM

[3M4-GS-4-04] Fusion of Linguistic and Citation Information in Scientific Literature using Transformer Model

〇Masanao Ochi1, Masanori Shiro2, Jun'ichiro Mori1, Ichiro Sakata1 (1. The University of Tokyo, 2. National Institute of Advanced Industrial Science and Technology)

[Online]

Keywords: Scholarly big data, Scientific research impact, SciBERT, GraphBERT

The Transformer model, released in 2017, was initially used in natural language processing but has since been widely adopted in fields such as image processing and network science. With Transformers, models pretrained on large datasets can be published and then fine-tuned on new data for individual tasks. The scientific literature contains a wide variety of data, including text, citations, and images of figures and tables. However, classification and regression studies have mainly extracted features from each data type separately and then combined them, so the interaction between data types has not been fully considered. This paper proposes an end-to-end method for fusing the linguistic and citation information of academic literature using the Transformer model. The proposed method improves the F-measure by 2.6 to 6.0 points compared with using either type of information alone. The method makes it possible to fuse various data from the academic literature end-to-end and shows the potential to efficiently improve the accuracy of various classification and prediction tasks.
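As a reading aid for the reported gain, the sketch below shows how an F-measure and a difference in points could be computed. It assumes the balanced F-measure (F1) for a binary task and reads "points" as percentage points. The abstract does not state the variant or averaging scheme, the function names are hypothetical, and the confusion counts are made up for illustration rather than taken from the paper.

```python
def f_measure(tp: int, fp: int, fn: int) -> float:
    """Balanced F-measure (F1) from true-positive, false-positive
    and false-negative counts."""
    if tp == 0:
        # No correct positive predictions: F1 is defined here as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


def point_gain(baseline: float, fused: float) -> float:
    """Difference between two scores in [0, 1], in percentage points."""
    return (fused - baseline) * 100


# Hypothetical counts chosen only for illustration (not from the paper).
baseline_f1 = f_measure(tp=70, fp=30, fn=30)  # single information source
fused_f1 = f_measure(tp=75, fp=25, fn=25)     # fused sources
print(f"gain: {point_gain(baseline_f1, fused_f1):.1f} points")
```

Under this reading, an improvement of 2.6 to 6.0 points would correspond to an absolute F1 increase of 0.026 to 0.060.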
