JSAI2019

Presentation information

General Session


[2I5-J-9] Natural language processing, information retrieval: classification and evaluation

Wed. Jun 5, 2019 5:20 PM - 7:00 PM Room I (306+307 Small meeting rooms)

Chair: Hirotoshi Taira, Reviewer: Kugatsu Sadamitsu

6:00 PM - 6:20 PM

[2I5-J-9-03] A Study of Patent Publications Classification Using Machine Translation and Rough Set Theory

〇Masaki Kurematsu (Iwate Prefectural University)

Keywords: Document Classification, Patent, Rough Set Theory, Machine Translation, Naive Bayes Classification

It is important to check existing patents before filing one's own patent applications or selling new products. However, checking a large number of patents is a hard task. To support this task, I propose in this paper a framework for a patent publication classification system using machine translation and rough set theory. It builds classifiers from patent publications labeled by experts in the following four steps. In step 1, the framework extracts sentences from patent abstracts based on block tags. In step 2, it translates these sentences into English using machine translation and extracts terms using term frequency and rough set reduction. In step 3, it builds a document-term matrix from the extracted terms. In step 4, it builds a Naive Bayes classifier and rough set rules from the document-term matrix. It then classifies unlabeled patent publications with these classifiers. I implemented this framework in R with some natural language processing tools and evaluated it. In the evaluation, I tried to classify some patent publications with an expert. The experimental results suggest that this approach is feasible.
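
As a rough illustration of steps 2 to 4, the sketch below selects terms by raw term frequency, builds a document-term matrix, and trains a multinomial Naive Bayes classifier with Laplace smoothing. It is not the author's implementation: it is written in Python rather than R, it assumes the sentences have already been translated into English, and it leaves out rough set reduction and rough set rule induction entirely. The toy sentences, the labels, the `min_count` threshold, and all function names are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase and keep purely alphabetic, whitespace-separated tokens.
    return [w for w in text.lower().split() if w.isalpha()]


def select_terms(docs, min_count=2):
    # Keep terms whose total frequency over all documents reaches min_count.
    # (Stand-in for the term extraction step; no rough set reduction here.)
    counts = Counter(w for d in docs for w in tokenize(d))
    return sorted(t for t, c in counts.items() if c >= min_count)


def document_term_matrix(docs, terms):
    # One row per document, one column per selected term, cells are counts.
    index = {t: i for i, t in enumerate(terms)}
    matrix = []
    for d in docs:
        row = [0] * len(terms)
        for w in tokenize(d):
            if w in index:
                row[index[w]] += 1
        matrix.append(row)
    return matrix


def train_naive_bayes(matrix, labels, alpha=1.0):
    # Multinomial Naive Bayes with Laplace (add-alpha) smoothing.
    n_terms = len(matrix[0])
    model = {}
    for c in sorted(set(labels)):
        rows = [r for r, y in zip(matrix, labels) if y == c]
        totals = [sum(col) for col in zip(*rows)]
        denom = sum(totals) + alpha * n_terms
        model[c] = {
            "log_prior": math.log(len(rows) / len(matrix)),
            "log_likelihood": [math.log((t + alpha) / denom) for t in totals],
        }
    return model


def classify(model, row):
    # Return the class with the highest log posterior score.
    def score(c):
        m = model[c]
        return m["log_prior"] + sum(
            n * ll for n, ll in zip(row, m["log_likelihood"])
        )
    return max(model, key=score)


# Hypothetical, already-translated sentences with made-up expert labels.
docs = [
    "battery electrode lithium cell",
    "lithium battery charging circuit",
    "engine piston fuel injection",
    "fuel engine exhaust valve",
]
labels = ["relevant", "relevant", "other", "other"]

terms = select_terms(docs)
model = train_naive_bayes(document_term_matrix(docs, terms), labels)
new_row = document_term_matrix(["lithium battery pack"], terms)[0]
print(classify(model, new_row))
```

Terms that fall below the frequency threshold, such as "pack" in the last sentence, are simply ignored when the new document is turned into a matrix row.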