JSAI2024

Presentation information

Poster Session

[3Xin2] Poster session 1

Thu. May 30, 2024 11:00 AM - 12:40 PM Room X (Event hall 1)

[3Xin2-77] Japanese practical evaluation of large-scale language models: Comparative analysis using JGLUE and IT passport exam

〇Masashi Hachuda1 (1.GMO Media, Inc.)

Keywords: AI, LLM, IT

In this study, we evaluated the usefulness of large-scale language models (LLMs) in the IT domain using the IT Passport exam and JGLUE, and examined how the area of expertise affects LLM accuracy. Experimental results showed that some LLMs can achieve a certain level of accuracy in the IT domain, but models that score highly on general-knowledge tasks such as JGLUE tend to struggle with IT problems. However, even for LLMs with weak IT performance, providing hints as prior information in the prompt improved accuracy for most models. Dependence on the prompt also affected performance: models that faithfully followed the prompt instructions achieved slightly higher accuracy, while some models improved their accuracy despite relying less on the prompt. This suggests that there is not necessarily a direct relationship between reliance on prompts and inferential ability.
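
The abstract does not describe the evaluation harness, so the following is only a minimal sketch of how accuracy with and without a hint in the prompt might be compared on multiple-choice questions. Everything in it is an assumption made for illustration: the prompt wording, the A-D answer labels, the item dictionary layout, and the `ask` callable standing in for a model.

```python
from typing import Callable, Optional, Sequence


def build_prompt(question: str, choices: Sequence[str],
                 hint: Optional[str] = None) -> str:
    """Format a four-choice question, optionally prefixed with a hint.

    The wording and the A-D labels are hypothetical, not the study's prompt.
    """
    lines = []
    if hint:
        lines.append(f"Hint: {hint}")
    lines.append(f"Question: {question}")
    for label, choice in zip("ABCD", choices):
        lines.append(f"{label}. {choice}")
    lines.append("Answer with a single letter (A-D).")
    return "\n".join(lines)


def accuracy(items: Sequence[dict], ask: Callable[[str], str],
             use_hint: bool) -> float:
    """Fraction of items whose reply starts with the gold answer label."""
    if not items:
        return 0.0
    correct = 0
    for item in items:
        hint = item["hint"] if use_hint else None
        prompt = build_prompt(item["question"], item["choices"], hint)
        # Only the first character of the reply is compared (assumed format).
        reply = ask(prompt).strip().upper()
        if reply[:1] == item["answer"]:
            correct += 1
    return correct / len(items)
```

Calling `accuracy` twice on the same items, once with `use_hint=True` and once with `use_hint=False`, gives the kind of with-hint versus without-hint comparison the abstract refers to. A model that ignores the requested answer format would score lower here regardless of what it knows, which is one way prompt dependence and accuracy can become entangled.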
