Multimodal Deep Model for POI Category Prediction using Linguistic and Image Information

Issei Sawada

2:10 PM - 2:30 PM

[2E4-GS-6-03] Multimodal Deep Model for POI Category Prediction using Linguistic and Image Information

〇Issei Sawada¹, Yusuke Okimoto², Kenta Kanamori², Itsuki Noda¹, Satoshi Oyama¹, Junji Saikawa² (1. Hokkaido University, 2. Yahoo! Japan)

Keywords: multimodal deep learning, user reviews

Accurate POI (point of interest) categories are increasingly important because many widely used services rely on them. Machine learning models are widely used to infer POI categories from various kinds of information, and multimodal deep models have recently been reported to perform well on many tasks. In this paper, we propose a multimodal deep model for POI category prediction that uses both linguistic and image information. To use image information effectively, the proposed model (1) introduces a loss against the prediction based only on linguistic information and (2) introduces pooling so that multiple images can be input for each POI. Using Yahoo! Japan's POI database, we confirmed that the proposed method improves POI category prediction performance over baselines that use only linguistic or only image information.
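The two ideas named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the pooling operator, the fusion method, or the exact form of the loss. The sketch below assumes mean pooling over per-image feature vectors, fusion by concatenation, and an auxiliary cross-entropy term on a text-only prediction weighted by a hypothetical `aux_weight`. All function and parameter names are illustrative.

```python
import math


def mean_pool(image_features):
    """Average a variable number of per-image feature vectors into one vector.

    Assumption: mean pooling; the abstract only says "pooling".
    """
    n = len(image_features)
    dim = len(image_features[0])
    return [sum(vec[d] for vec in image_features) / n for d in range(dim)]


def linear(weights, x):
    """Apply a weight matrix (one row per category) to a feature vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def cross_entropy(logits, label):
    """Softmax cross-entropy for one example, using a stable log-sum-exp."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[label]


def combined_loss(text_feat, image_feats, label, w_text, w_multi, aux_weight=0.5):
    """Multimodal loss plus a weighted loss on the text-only prediction.

    Assumption: the text-only term is a plain auxiliary cross-entropy;
    aux_weight is a hypothetical hyperparameter, not taken from the paper.
    """
    pooled = mean_pool(image_feats)          # one vector regardless of image count
    fused = list(text_feat) + pooled         # fusion by concatenation (assumed)
    multi_logits = linear(w_multi, fused)    # prediction from text + images
    text_logits = linear(w_text, text_feat)  # prediction from text only
    return cross_entropy(multi_logits, label) + aux_weight * cross_entropy(text_logits, label)


# Toy usage: 2 categories, 2-dim text features, 2-dim image features, 3 images.
text_feat = [0.2, -0.1]
image_feats = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
w_text = [[0.3, 0.1], [-0.2, 0.4]]
w_multi = [[0.3, 0.1, 0.5, -0.5], [-0.2, 0.4, -0.5, 0.5]]
loss = combined_loss(text_feat, image_feats, label=0, w_text=w_text, w_multi=w_multi)
```

Because the image vectors are pooled before fusion, the fused input has the same size whether a POI has one image or many, which is what lets a single model accept a varying number of images per POI.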

Presentation information

[2E4-GS-6] Language media processing
