JSAI2024

Presentation information

General Session

[4A1-GS-6] Language media processing

Fri. May 31, 2024 9:00 AM - 10:40 AM Room A (Main hall)

Chair: 田中 駿 (JX通信社)

10:20 AM - 10:40 AM

[4A1-GS-6-05] Search Query Expansion Method for Patent Documents Combining Large Language Models and Thesaurus

〇Kaede Mori1,2, Hirofumi Nonaka3, Asahi Hentona4, Seiya Kawano5, Koichiro Yoshino5, Koji Marusaki1, Shotaro Kataoka1,6 (1. Nagaoka University of Technology, 2. Kikagaku, Inc., 3. Aichi Institute of Technology, 4. CyberAgent, 5. Guardian Robot Project, RIKEN, 6. MayoLab Co., Ltd.)

Keywords: Patent Retrieval, LLM, NLP, query expansion

Patent retrieval is the process of searching patent databases for information on technologies, inventions, inventors, and applicants. Thorough prior art searches are crucial because a court finding of patent infringement can result in substantial damages or licensing fees. However, patent documents use distinctive vocabulary and the number of documents is vast, which makes the search process costly. Several methods aim to make searches exhaustive by expanding search queries, but they generally struggle with complex vocabulary that appears in only a small number of patents. This study therefore proposes a query expansion method that combines thesauruses and large language models (LLMs). As a foundational analysis of the method, it examines the output tendencies of LLMs and the independence and co-occurrence rates of the new words generated by existing thesauruses and by LLMs. The results show that new words generated by LLMs had low co-occurrence with existing thesauruses. This success in generating new vocabulary with LLMs suggests the potential for comprehensive patent searches that accommodate the unique vocabulary and complex expressions of patent documents.
