JSAI2024

Presentation information

General Session


[2F4-GS-5] Agents

Wed. May 29, 2024 1:30 PM - 3:10 PM Room F (Temporary room 4)

Chair: 上野 史 (Okayama University) [Online]

2:10 PM - 2:30 PM

[2F4-GS-5-03] Layout Generation Agents with Large Language Models

〇Yuichi Sasazawa1, Yasuhiro Sogawa1 (1. Hitachi, Ltd.)

Keywords: Layout Generation, Large Language Model, Multimodal, Agent

Demand for customizable 3D virtual spaces has grown in recent years. Because creating these spaces requires significant human effort, more efficient methods of virtual space creation are needed. Existing studies have proposed methods for automatically generating layouts such as floor plans and furniture arrangements, but these methods only generate text outlining the layout structure from the input instructions and do not use the information obtained during the generation process. In this study, we propose an agent-driven layout generation system that uses GPT-4V, a multimodal large language model, and validate its effectiveness. Specifically, the language model controls agents that sequentially place objects in the virtual space, generating layouts that reflect user instructions. Experimental results confirm that the proposed method generates virtual spaces reflecting user instructions with a high success rate. In addition, ablation tests identified the elements that contribute to improved behavior generation performance.
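
The idea of placing objects one at a time, with the current state of the space fed back into each decision, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `Room`, `run_agent`, and `first_free_cell_policy` are hypothetical, the room is reduced to a small grid, and a scripted policy stands in for the language model call so that the example runs without any external service.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Room:
    """Hypothetical grid room; each cell holds at most one object."""
    width: int
    depth: int
    placed: List[Tuple[str, int, int]] = field(default_factory=list)

    def is_free(self, x: int, y: int) -> bool:
        inside = 0 <= x < self.width and 0 <= y < self.depth
        return inside and all((px, py) != (x, y) for _, px, py in self.placed)

    def place(self, name: str, x: int, y: int) -> bool:
        # Reject placements that fall outside the room or on an occupied cell.
        if not self.is_free(x, y):
            return False
        self.placed.append((name, x, y))
        return True


# A policy sees the instruction, the current room, and the objects still to
# place, and returns (object name, x, y) or None to stop. In a real system
# this is where a multimodal model would be queried (assumption).
Policy = Callable[[str, Room, List[str]], Optional[Tuple[str, int, int]]]


def first_free_cell_policy(instruction: str, room: Room,
                           remaining: List[str]) -> Optional[Tuple[str, int, int]]:
    """Scripted stand-in for the model: next object goes to the first free cell."""
    for y in range(room.depth):
        for x in range(room.width):
            if room.is_free(x, y):
                return remaining[0], x, y
    return None  # no free cell left


def run_agent(room: Room, instruction: str, objects: List[str],
              policy: Policy, max_steps: int = 20) -> Room:
    """Place objects sequentially, showing the updated room to the policy each step."""
    remaining = list(objects)
    for _ in range(max_steps):
        if not remaining:
            break
        action = policy(instruction, room, remaining)
        if action is None:
            break
        name, x, y = action
        if name in remaining and room.place(name, x, y):
            remaining.remove(name)
    return room


if __name__ == "__main__":
    result = run_agent(Room(2, 2), "Place a bed and a desk",
                       ["bed", "desk"], first_free_cell_policy)
    print(result.placed)
```

Because the policy receives the updated room at every step, each decision can depend on what has already been placed, which is the contrast the abstract draws with methods that emit the whole layout as text in one pass.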
