JSAI2025

Presentation information

Organized Session

[1P4-OS-1b] OS-1

Tue. May 27, 2025 3:40 PM - 5:20 PM Room P (Room 801-2)

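Organizers: Kenji Suzuki (Sony Group), Satoshi Hara (The University of Electro-Communications), Hitomi Yanaka (The University of Tokyo), Saku Sugawara (National Institute of Informatics)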

4:20 PM - 4:40 PM

[1P4-OS-1b-03] Study of an evaluation based on a method of extracting key points for domestic agricultural LLMs.

〇Junichi Ishihara1, Akio Kobayashi1, Tetsuo Katsuragi1, Masahiro Otomo1, Akira Hashimoto2, Kotaro Sakamoto3, Atomu Sugimura4, Junichi Yonemaru1, Takahiro Kawamura1 (1. National Agriculture and Food Research Organization, 2. Tsukuba University, 3. BESNA Institute Inc., 4. Agriculture Research Institute in Mie Prefecture)

Keywords: Large Language Models, Agriculture Information

In this study, instruction data were constructed from manuals on strawberries provided by the Mie Prefectural Agricultural Research Institute, and the Elyza-8B model was instruction-tuned on this data. Because the system is domain-specific, its answers should cover the relevant expertise and the key points that need to be addressed. The authors have therefore proposed an automatic evaluation method that uses an LLM as a judge: an LLM extracts a list of key points through predicate-argument structure analysis, and the method then recognizes their entailment relations with the reference answer data. However, this method could not recognize the entailment relations appropriately when the model output contained more specific key points. In this paper, we rigorously evaluate the performance of this automatic evaluation method using newly constructed manual key point extraction data and discuss improvements to it.
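As a rough illustration of the key point based scoring described in the abstract, the following minimal sketch computes the fraction of reference key points that are entailed by at least one key point from a model answer. It is not the authors' implementation: the function names, the coverage formula, and the word-overlap `toy_judge` are all assumptions made for illustration, and in practice the judge would be an LLM call that decides entailment.

```python
from typing import Callable, List

# Hypothetical judge signature: returns True when `premise` entails `hypothesis`.
# In a real setup this would wrap an LLM-as-a-judge call (an assumption here).
EntailmentJudge = Callable[[str, str], bool]


def key_point_coverage(
    reference_points: List[str],
    answer_points: List[str],
    entails: EntailmentJudge,
) -> float:
    """Fraction of reference key points entailed by some answer key point."""
    if not reference_points:
        return 0.0
    covered = sum(
        1
        for ref in reference_points
        if any(entails(ans, ref) for ans in answer_points)
    )
    return covered / len(reference_points)


def toy_judge(premise: str, hypothesis: str) -> bool:
    """Stand-in judge: every word of the hypothesis must appear in the premise."""
    return set(hypothesis.lower().split()) <= set(premise.lower().split())


# Made-up key points, used only to exercise the functions above.
reference = ["water in the morning", "remove old leaves"]
answer = ["water the plants in the morning", "apply fertilizer monthly"]
score = key_point_coverage(reference, answer, toy_judge)
print(score)
```

Passing the judge in as a callable keeps the scoring logic independent of the entailment decision, which is the step the abstract reports as failing when the model output contains more specific key points.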
