10:20 AM - 10:40 AM
[4A1-GS-6-05] Search Query Expansion Method for Patent Documents Combining Large Language Models and Thesaurus
Keywords: Patent Retrieval, LLM, NLP, Query Expansion
Patent retrieval is the process of searching patent databases for information on technologies, inventions, inventors, and applicants. Because a court finding of patent infringement can result in substantial damages or licensing fees, conducting thorough prior art searches is crucial. However, patent documents use distinctive vocabularies and the number of documents is vast, which makes the search process costly. Several methods aim at exhaustive search by expanding search queries, but they generally struggle with complex vocabulary that appears in only a small number of patents. This study therefore proposes a query expansion method that combines thesauruses with large language models (LLMs). As a foundational analysis of the method, it examines the output tendencies of LLMs and the independence and co-occurrence rates of the new words generated by existing thesauruses and by LLMs. The analysis found that new words generated by LLMs had low co-occurrence with existing thesauruses. This success in generating new vocabulary with LLMs suggests the potential for comprehensive patent searches that can accommodate the unique vocabularies and complex expressions of patent documents.
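The idea of combining the two term sources and measuring how much they overlap can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define its co-occurrence rate, so the sketch assumes the rate is the fraction of LLM-generated terms that also appear in the thesaurus expansion. The function names, the toy thesaurus, and the hard-coded list standing in for LLM output are all hypothetical.

```python
def normalize(term: str) -> str:
    """Lowercase a term and collapse whitespace so comparisons are consistent."""
    return " ".join(term.lower().split())


def thesaurus_expand(term: str, thesaurus: dict[str, list[str]]) -> set[str]:
    """Return the normalized thesaurus entries for a term (empty set if none)."""
    return {normalize(t) for t in thesaurus.get(normalize(term), [])}


def overlap_rate(llm_terms: list[str], thesaurus_terms: set[str]) -> float:
    """Assumed definition: share of LLM terms that the thesaurus also produced."""
    llm = {normalize(t) for t in llm_terms}
    if not llm:
        return 0.0
    return len(llm & thesaurus_terms) / len(llm)


def expand_query(term: str, thesaurus: dict[str, list[str]],
                 llm_terms: list[str]) -> set[str]:
    """Union of the original term, thesaurus terms, and LLM-generated terms."""
    return ({normalize(term)}
            | thesaurus_expand(term, thesaurus)
            | {normalize(t) for t in llm_terms})


# Hypothetical toy data; a real system would query a patent thesaurus and an LLM.
TOY_THESAURUS = {"battery": ["accumulator", "storage cell"]}
TOY_LLM_OUTPUT = [
    "Accumulator",
    "secondary cell",
    "electrochemical storage device",
    "power storage unit",
]

thesaurus_terms = thesaurus_expand("battery", TOY_THESAURUS)
rate = overlap_rate(TOY_LLM_OUTPUT, thesaurus_terms)
expanded = expand_query("battery", TOY_THESAURUS, TOY_LLM_OUTPUT)

print(f"overlap rate: {rate:.2f}")
print(f"expanded query terms: {sorted(expanded)}")
```

Under this assumed definition, a low overlap rate means the LLM contributes mostly terms the thesaurus does not already supply, so the union of the two sources is larger than either alone.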