JSAI2023

Presentation information

General Session


[4A3-GS-6] Language media processing

Fri. Jun 9, 2023 2:00 PM - 3:40 PM Room A (Main hall)

Chair: 庵 愛 (NTT) [On-site]

2:40 PM - 3:00 PM

[4A3-GS-6-03] Text retrieval with multi-stage reranking model

〇Yuichi Sasazawa1, Kenichi Yokote1, Osamu Imaichi1, Yasuhiro Sogawa1 (1. Hitachi, Ltd. Research & Development Group)

Keywords: text retrieval, Natural Language Processing, pre-trained language model

Text retrieval is the task of retrieving documents similar to a search query, and it is important to improve retrieval accuracy while maintaining a certain level of retrieval speed. One text retrieval method is re-ranking with a language model. However, increasing the number of model parameters or using model ensembles to improve accuracy slows down retrieval. To improve accuracy while minimizing the delay in retrieval, we propose a multi-stage text retrieval model using a highly accurate language model. We first rank the documents with BM25 and language models, and then re-rank only the documents with high similarity to the query using a model ensemble or a larger language model. In our experiments, we train the MiniLM language model on the MS-MARCO dataset and evaluate it in a zero-shot setting. Our proposed method achieves higher retrieval accuracy while limiting the loss of retrieval speed.
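
The staged ranking described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the cutoffs, the tokenization, or how scores are combined, so the names `multi_stage_rerank`, `small_scorer`, `large_scorer`, `k_bm25`, and `k_large` are hypothetical, the BM25 function is a plain textbook variant over whitespace tokens, and the two scorers stand in for a small language model and a larger model or ensemble.

```python
import math
from collections import Counter
from typing import Callable, List, Sequence

# Hypothetical scorer type: (query, document) -> relevance score.
Scorer = Callable[[str, str], float]


def bm25_scores(query: str, docs: Sequence[str],
                k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Textbook BM25 over whitespace tokens (an assumption, not the paper's setup)."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n if n else 0.0
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def multi_stage_rerank(query: str, docs: Sequence[str],
                       small_scorer: Scorer, large_scorer: Scorer,
                       k_bm25: int = 100, k_large: int = 10) -> List[int]:
    """Return document indices, best first, after three ranking stages."""
    # Stage 1: cheap lexical ranking over the whole collection.
    stage1 = sorted(enumerate(bm25_scores(query, docs)),
                    key=lambda p: p[1], reverse=True)[:k_bm25]
    # Stage 2: rescore the BM25 candidates with the small model.
    stage2 = sorted(((i, small_scorer(query, docs[i])) for i, _ in stage1),
                    key=lambda p: p[1], reverse=True)
    # Stage 3: only the top k_large candidates reach the expensive model;
    # the remaining candidates keep their stage-2 order below them.
    head, tail = stage2[:k_large], stage2[k_large:]
    head = sorted(((i, large_scorer(query, docs[i])) for i, _ in head),
                  key=lambda p: p[1], reverse=True)
    return [i for i, _ in head] + [i for i, _ in tail]
```

In this sketch the expensive scorer runs on at most `k_large` documents per query, which is the kind of trade-off the abstract describes: the added cost stays bounded while the top of the ranking is refined.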
