JSAI2018

Presentation information

Oral presentation

[2L1] [General Session] 9. NLP / IR

Wed. Jun 6, 2018 9:00 AM - 10:40 AM Room L (3F Sapphire Hall Asuka)

Chair: Toshihiko Yanase (Hitachi, Ltd.)

10:00 AM - 10:20 AM

[2L1-04] Mathematical Expression Retrieval in PDF documents from Web using Mathematical Terms as Queries

〇Kuniko Yamada¹, Harumi Murakami¹ (1. Osaka City University)

Keywords: mathematical information retrieval, mathematical expression image, web search

Because mathematical expressions on the web are not annotated with natural language, they are difficult to find with conventional search engines. Our proposed method performs a web search using a mathematical term as the query and extracts the expressions related to that term from the retrieved PDF documents. We convert each PDF to TeX, render the mathematical descriptions in the TeX source as images, and compute image features from them. An SVM trained on these features discriminates the expressions related to the query from the others. In our experiments, the mean reciprocal rank (MRR) was highest when both PDF and HTML documents were used.
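
The abstract reports MRR but does not say how it was computed. As a rough illustration only, the sketch below uses the standard definition of mean reciprocal rank: for each query, take the reciprocal of the rank of the first relevant result (0 if none is relevant), then average over queries. The function names and the example relevance judgments are hypothetical and are not taken from the authors' experiments.

```python
from typing import Sequence


def reciprocal_rank(relevance: Sequence[bool]) -> float:
    """Return 1/rank of the first relevant result, or 0.0 if none is relevant."""
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: Sequence[Sequence[bool]]) -> float:
    """Average the reciprocal rank over all queries (one ranked list per query)."""
    if not runs:
        raise ValueError("at least one query is required")
    return sum(reciprocal_rank(run) for run in runs) / len(runs)


# Hypothetical relevance judgments for three queries (assumed data, not from
# the paper). Each list is a ranked result list; True marks an expression
# judged relevant to the query term.
example_runs = [
    [False, True, False],   # first relevant result at rank 2 -> 1/2
    [True, False, False],   # first relevant result at rank 1 -> 1
    [False, False, False],  # no relevant result -> 0
]

mrr = mean_reciprocal_rank(example_runs)
print(f"MRR = {mrr:.3f}")  # MRR = 0.500
```

Under this definition, a system that ranks a relevant expression first for every query scores 1.0, so a higher MRR means relevant expressions tend to appear nearer the top of the result list.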