4:30 PM - 4:50 PM
[2I5-GS-10-04] Proposal of a Question-Answering System Using RAG with Cost Reduction Techniques
Keywords: Retrieval-Augmented Generation, ChatGPT
Retrieval-Augmented Generation (RAG) is a technique that enables question answering over sources such as internal organizational documents by supplying external information to a large language model. In recent years, question-answering services that combine ChatGPT with RAG have become increasingly common. However, using a high-performance model such as GPT-4 at scale drives up API costs, because the number of input tokens grows with the amount of retrieved text. This study proposes an additional step in which a lower-cost model, such as GPT-3.5, extracts only the necessary information from the retrieved documents before the response is generated. The aim is to reduce the number of tokens passed to GPT-4 during response generation and thereby lower its operational cost. The paper compares the proposed method with conventional methods to assess its effectiveness. The results indicate that the proposed method reduces cost while maintaining accuracy.
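The two-stage idea described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the prompt wording, the whitespace-based token estimate, and the toy stand-in models are all hypothetical, and no real API is called. In practice `cheap_llm` and `strong_llm` would wrap calls to a lower-cost and a higher-cost model respectively.

```python
from typing import Callable, List, Tuple

# Hypothetical interface: any function mapping a prompt string to a reply string.
LLM = Callable[[str], str]


def rough_token_count(text: str) -> int:
    """Whitespace word count, a crude stand-in for a real tokenizer."""
    return len(text.split())


def extract_relevant(question: str, documents: List[str], cheap_llm: LLM) -> List[str]:
    """Ask the lower-cost model to keep only the passages the question needs."""
    extracts = []
    for doc in documents:
        prompt = (
            "Copy only the sentences needed to answer the question. "
            "Reply NONE if nothing is relevant.\n"
            f"Question: {question}\nDocument: {doc}"
        )
        reply = cheap_llm(prompt).strip()
        if reply and reply != "NONE":
            extracts.append(reply)
    return extracts


def build_prompt(question: str, passages: List[str]) -> str:
    """Assemble the prompt sent to the answering model."""
    context = "\n".join(passages)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


def answer_baseline(question: str, documents: List[str], strong_llm: LLM) -> Tuple[str, int]:
    """Conventional RAG: the full retrieved documents go to the strong model."""
    prompt = build_prompt(question, documents)
    return strong_llm(prompt), rough_token_count(prompt)


def answer_with_extraction(
    question: str, documents: List[str], cheap_llm: LLM, strong_llm: LLM
) -> Tuple[str, int]:
    """Two-stage variant: filter with the cheap model, then answer with the strong one."""
    extracts = extract_relevant(question, documents, cheap_llm)
    prompt = build_prompt(question, extracts)
    return strong_llm(prompt), rough_token_count(prompt)


# Toy stand-ins so the sketch runs without network access (assumptions, not real models).
def toy_cheap_llm(prompt: str) -> str:
    """Keep sentences that share a longer keyword with the question."""
    question = prompt.split("Question: ")[1].split("\n")[0]
    document = prompt.split("Document: ")[1]
    keywords = {w.strip("?.,").lower() for w in question.split() if len(w) > 4}
    kept = [
        s for s in document.split(". ")
        if keywords & {w.strip("?.,").lower() for w in s.split()}
    ]
    return ". ".join(kept) if kept else "NONE"


def toy_strong_llm(prompt: str) -> str:
    """Placeholder answerer; a real model would generate the response here."""
    return "stub answer"


if __name__ == "__main__":
    docs = [
        "The office opens at nine. Employees receive twenty vacation days per year. "
        "The cafeteria serves lunch daily.",
        "Parking permits are issued by the facilities team. Visitors must sign in at reception.",
    ]
    q = "How many vacation days do employees get?"
    _, full = answer_baseline(q, docs, toy_strong_llm)
    _, filtered = answer_with_extraction(q, docs, toy_cheap_llm, toy_strong_llm)
    print(f"strong-model prompt size: baseline={full}, with extraction={filtered}")
```

Note that the extraction step itself consumes tokens on the lower-cost model, so whether the total cost falls depends on the price ratio between the two models and on how much text the filter removes.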