JSAI2019

Presentation information

General Session » [GS] J-9 Natural language processing, information retrieval

[1N4-J-9] Natural language processing, information retrieval: domain knowledge analysis

Tue. Jun 4, 2019 5:20 PM - 6:40 PM Room N (Front-right room of 1F Exhibition hall)

Chair: Tomoko Okuma, Reviewer: Kugatsu Sadamitsu

6:20 PM - 6:40 PM

[1N4-J-9-04] Analysis of vocabulary and omitted words in car license tests

〇Seiki Matoba1, Masaki Koga1, Motohiro Otuka1, Ichirou Kobayashi2, Hirotoshi Taira1 (1. Faculty of Information Science and Technology, Osaka Institute of Technology, 2. Graduate School of Humanities and Sciences, Ochanomizu University)

Keywords: AI, car licence test, automatic problem solver

We develop a solver for Japanese car license tests.

The test consists of about one hundred true/false questions on traffic rules, driving manners, the structure of cars, and the laws of physics related to cars.

While the passing score is 90%, the best score achieved by previous approaches is about 65%.

The approach is based on the similarity between a test sentence and the most similar sentence, paired with a gold-standard answer, in the solver's database.
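
As an illustration of this kind of similarity-based solver, here is a minimal sketch. The abstract does not say which similarity measure or database format is used, so the character-bigram cosine similarity, the function names, and the toy English database below are all assumptions chosen for readability, not the authors' implementation.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n=2):
    """Bag of character n-grams; needs no tokenizer, so it also works for Japanese."""
    text = text.replace(" ", "")
    if len(text) < n:
        return Counter([text]) if text else Counter()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm = sqrt(sum(c * c for c in a.values())) * sqrt(sum(c * c for c in b.values()))
    return dot / norm if norm else 0.0


def answer(question, database):
    """Return the gold answer attached to the most similar stored sentence."""
    q = char_ngrams(question)
    _, label = max(database, key=lambda item: cosine(q, char_ngrams(item[0])))
    return label


# Hypothetical database of (sentence, gold true/false answer) pairs.
DATABASE = [
    ("You must stop at a red light.", True),
    ("You may park on a pedestrian crossing.", False),
]

print(answer("Drivers must stop at a red light.", DATABASE))
print(answer("Parking on a pedestrian crossing is allowed.", DATABASE))
```

A solver of this shape copies the label of its nearest neighbor, so a question whose subject or object is left unstated has less surface text to match against, which is one plausible way omitted words could hurt accuracy.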

Toward a system that can pass the test, we analyzed the vocabulary and writing style of the tests.

The analysis showed that the vocabulary is relatively small, about 300 words for 100 problems, and that the sentences contain many zero pronouns, which lower the accuracy of the solver.
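
A vocabulary measurement of this kind can be sketched as a count of distinct word types over tokenized problems. The sketch assumes the sentences have already been split into words (Japanese text would first need a morphological analyzer, which is not shown here); the function name and the toy data are hypothetical.

```python
from collections import Counter


def vocabulary_stats(tokenized_problems):
    """Count distinct word types and total tokens over a list of token lists."""
    counts = Counter(tok for problem in tokenized_problems for tok in problem)
    return {
        "types": len(counts),            # vocabulary size
        "tokens": sum(counts.values()),  # total number of word occurrences
        "most_common": counts.most_common(3),
    }


# Toy, pre-tokenized problems used only for illustration.
problems = [
    ["stop", "at", "red", "light"],
    ["stop", "at", "crossing"],
]
stats = vocabulary_stats(problems)
print(stats["types"], stats["tokens"])
```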

Furthermore, we tried to resolve the antecedents using an existing anaphora resolution system.

The results showed that the system cannot resolve the anaphora in the tests: each problem consists of only one sentence, so there are very few clues for resolving the pronouns, and they are more difficult to resolve than those in standard articles.

The analysis revealed that high-performance systems require anaphora resolution that relies more heavily on domain-specific knowledge.