1:50 PM - 2:10 PM
[2P4-GS-11-02] Multiclass Web Page Classification for Visualizing Researchers' Activities
Keywords: Web Page Classification
Finding experts in a particular field is usually difficult because tracking the activities of many researchers is time-consuming. Although some efforts have been made to summarize researchers' work and interests using bibliographic records, other relevant information about their professional careers is scattered across various web pages. This paper introduces a multiclass classification method for researcher web pages that uses such miscellaneous information to characterize each researcher. Our method uses two neural networks based on pre-trained embedding models to extract descriptive features from the URL and the page text. We constructed a dataset of Japanese researchers, and the experimental results demonstrate the effectiveness of the proposed method compared with baseline and conventional methods.