JSAI2025

Presentation information

General Session

[3A6-GS-10] AI application

Thu. May 29, 2025 5:40 PM - 7:20 PM Room A (Large hall)

Chair: 林 兵馬 (Kobe University Secondary School)

6:00 PM - 6:20 PM

[3A6-GS-10-02] Performance Evaluation of Multimodal LLM with Life Insurance Business Data

〇Tamao Shimizu1, Hibiki Bannai1, Yoshiaki Onishi1 (1. The Dai-ichi Life Techno Cross Co., Ltd.)

Keywords: Industrial Application

To apply multimodal LLMs to a life insurance company's inquiry-response work, we constructed a benchmark from business data and compared the practical performance of multiple models. We evaluated three models, Claude 3.5 Sonnet, Gemini 1.5 Pro, and GPT-4o, on two tasks: document QA and conversion of image content to text. Claude 3.5 Sonnet achieved the highest accuracy on document QA, and Gemini 1.5 Pro achieved the highest accuracy on the image-to-text conversion task. We also identified the characteristics of charts and tables in in-house documents that the LLMs found difficult to recognize. These evaluations confirmed that benchmarking with business data yields results that differ from those of publicly available general-purpose benchmarks.
