4:20 PM - 4:40 PM
[1P4-OS-1b-03] Study of an evaluation based on a method of extracting key points for domestic agricultural LLMs.
Keywords: Large Language Models, Agriculture Information
In this study, we constructed instruction data from manuals on strawberries provided by the Mie Prefectural Agricultural Research Institute and applied instruction tuning to the Elyza-8B model with this data. Because the system is domain-specific, its answers should cover the relevant expertise and the key points to be addressed. We have therefore proposed an automatic LLM-as-a-Judge evaluation method that uses an LLM to extract a list of key points through predicate-argument structure analysis and then recognizes their entailment relations with the reference answer data. However, this method failed to recognize entailment relations appropriately when the model output contained more specific key points. In this paper, we report a rigorous performance evaluation of this automatic evaluation method, and improvements to it, using newly constructed manual key point extraction data.
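The key-point idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`toy_entails`, `key_point_coverage`), the direction of the entailment check (each reference key point is tested against the model answer), and the example sentences are all assumptions made for illustration. The entailment judge is a toy word-overlap check standing in for an LLM judge, which the abstract does not specify in detail.

```python
from typing import Callable, List


def toy_entails(premise: str, hypothesis: str) -> bool:
    """Toy stand-in for an LLM judge (assumption, not the paper's method).

    Returns True when every word of the hypothesis appears in the premise.
    """
    premise_words = set(premise.lower().split())
    return all(word in premise_words for word in hypothesis.lower().split())


def key_point_coverage(
    answer: str,
    key_points: List[str],
    entails: Callable[[str, str], bool] = toy_entails,
) -> float:
    """Fraction of key points judged to be entailed by the answer."""
    if not key_points:
        return 0.0
    covered = sum(1 for point in key_points if entails(answer, point))
    return covered / len(key_points)


# Made-up example data, not taken from the strawberry manuals.
answer = "water the strawberries in the morning and remove old leaves"
key_points = [
    "water in the morning",
    "remove old leaves",
    "ventilate the greenhouse",
]
score = key_point_coverage(answer, key_points)
print(f"coverage = {score:.2f}")
```

In practice the `entails` argument would be replaced by a call to a judge model. A literal check like the toy one misses any key point that the answer expresses in different or more specific words, which is the kind of failure the abstract reports for the original method.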