Evaluation of Using Large Language Models for Improving the Efficiency of Survey Research

Masahiro Honda; Takehiro Takayanagi; Takahiro Hoshino

[2Win5-91] Evaluation of Using Large Language Models for Improving the Efficiency of Survey Research

〇Masahiro Honda^1,3, Takehiro Takayanagi^2, Takahiro Hoshino^1,3 (1. Keio University, 2. Univ. of Tokyo, 3. RIKEN AIP)

Keywords: LLMs, Survey Responses Generation, LLMs Evaluation, Factor Analysis, Persona Generation

This study examined the capabilities of large language models (LLMs) to replicate the responses of actual survey participants. The results indicate that when demographic information and previous survey data are used as inputs for data generation, LLMs demonstrate a high degree of similarity in traditional evaluation metrics, such as the means of responses and accuracy rates. However, they struggled with newly proposed tasks designed to capture latent variables that are common across questions (e.g., consumer psychology), which reflect the underlying factor structure. We found that data generated using personas and existing datasets performed better across all evaluation criteria. In conclusion, this study introduced a novel method for approximating survey responses through synthetic data generation and proposed new evaluation metrics. These findings suggest that LLMs have significant potential for capturing the latent factors that underlie generated responses.
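
To make the contrast between the traditional metrics and the factor-structure view concrete, here is a minimal Python sketch. It is not the authors' evaluation code: the function names, the toy data, and the use of the leading eigenvalue of the item correlation matrix as a stand-in for a fitted factor model are all assumptions made for illustration. It shows how synthetic responses can match the per-question means exactly while losing the latent variable shared across questions.

```python
import numpy as np

def item_mean_gap(real, synth):
    """Mean absolute difference between per-question response means."""
    return float(np.mean(np.abs(real.mean(axis=0) - synth.mean(axis=0))))

def accuracy_rate(real, synth):
    """Share of (respondent, question) cells where both answers are identical."""
    return float(np.mean(real == synth))

def first_factor_share(data):
    """Share of total variance carried by the leading eigenvalue of the item
    correlation matrix. Assumption: a crude proxy for how strongly a single
    common factor drives the answers, not a fitted factor-analysis model."""
    corr = np.corrcoef(data, rowvar=False)
    eigvals = np.linalg.eigvalsh(corr)  # eigenvalues in ascending order
    return float(eigvals[-1] / eigvals.sum())

def make_toy_data(seed=0, n=500, k=6):
    """Hypothetical 5-point Likert data driven by one latent trait."""
    rng = np.random.default_rng(seed)
    latent = rng.normal(size=(n, 1))
    noise = rng.normal(scale=0.7, size=(n, k))
    real = np.clip(np.rint(3 + latent + noise), 1, 5)
    # "Synthetic" answers: shuffle each question independently across
    # respondents. Marginal distributions are kept, the shared factor is not.
    synth = np.column_stack([rng.permutation(real[:, j]) for j in range(k)])
    return real, synth

if __name__ == "__main__":
    real, synth = make_toy_data()
    print("mean gap:", item_mean_gap(real, synth))
    print("accuracy:", accuracy_rate(real, synth))
    print("first-factor share (real): ", first_factor_share(real))
    print("first-factor share (synth):", first_factor_share(synth))
```

In this toy setup the mean gap is zero by construction, yet the first-factor share of the shuffled data falls well below that of the original, which is the kind of discrepancy that a means-only evaluation would miss.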

Presentation information

[2Win5] Poster session 2
