JSAI2024

Presentation information

General Session » GS-7 Vision, speech media processing

[2C1-GS-7] Language media processing

Wed. May 29, 2024 9:00 AM - 10:40 AM Room C (Temporary room 1)

Chair: 西澤直樹 (Toshiba Corporation)

10:20 AM - 10:40 AM

[2C1-GS-7-05] Towards General Text-to-Design System with LLM-based Agents

〇Hisaki Seki¹, Kotaro Kikuchi², Naoto Inoue², Mayu Otani², Kota Yamaguchi², Edgar Simo-Serra¹ (1. Waseda University, 2. CyberAgent)

Keywords: LLM-based Agents, Text-to-Design, User Interface, Graphic Design

Design tasks, including user interface design, can be complex and time-consuming for non-designers. Existing methods for generating designs include creating pixel images and training models to output designs in specific formats from natural language queries. However, a more general approach to design tasks is needed, one that does not rely on a specific format and that produces editable results. We propose DesignPlanner, a system that executes design tasks using a large language model. Built on an existing methodology as its core, the system consists of two components: Planner and Executor. Planner breaks the query down into subqueries, and Executor executes the subqueries by calling registered functions. Our system performs simple operations related to Web UI design, such as using existing components and editing them. Our prototype evaluation shows that the approach is useful for simple tasks but can fail on more complex tasks because of incorrect function calls.
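
The Planner and Executor split described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Design`, `register`, `plan`, `choose_call`, and `execute`, the two registered functions, and the canned lookup table are all hypothetical, and the places where DesignPlanner would consult a large language model are replaced with fixed rules so the example runs on its own.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Design:
    """Hypothetical editable design state: a flat list of UI components."""
    components: List[dict] = field(default_factory=list)


# Registry of functions the Executor is allowed to call (illustrative names).
REGISTRY: Dict[str, Callable[..., None]] = {}


def register(name: str):
    """Decorator that adds a function to the registry under `name`."""
    def wrap(fn):
        REGISTRY[name] = fn
        return fn
    return wrap


@register("add_component")
def add_component(design: Design, kind: str) -> None:
    design.components.append({"kind": kind, "props": {}})


@register("set_property")
def set_property(design: Design, index: int, key: str, value: str) -> None:
    design.components[index]["props"][key] = value


def plan(query: str) -> List[str]:
    """Planner stand-in: split the query into subqueries.
    Assumption: a real planner would prompt an LLM instead of splitting on ' then '."""
    return [part.strip() for part in query.split(" then ") if part.strip()]


# Assumption: a fixed table replaces the LLM's choice of function and arguments.
CANNED_CALLS: Dict[str, Tuple[str, tuple]] = {
    "add a button": ("add_component", ("button",)),
    "make it blue": ("set_property", (0, "color", "blue")),
}


def choose_call(subquery: str) -> Tuple[str, tuple]:
    """Map a subquery to (function name, args); unknown subqueries yield a bad call."""
    return CANNED_CALLS.get(subquery, ("unknown_function", ()))


def execute(design: Design, subqueries: List[str]) -> List[str]:
    """Executor stand-in: run each subquery through a registered function.
    Returns a list of error messages, one per failed call."""
    errors = []
    for subquery in subqueries:
        name, args = choose_call(subquery)
        fn = REGISTRY.get(name)
        if fn is None:
            errors.append(f"unregistered function: {name}")
            continue
        try:
            fn(design, *args)
        except (IndexError, TypeError) as exc:
            errors.append(f"{name} failed: {exc}")
    return errors


design = Design()
errors = execute(design, plan("add a button then make it blue"))
```

Because the design state stays a structured list of components, the result remains editable after execution. In this sketch a subquery that resolves to an unregistered function, or to a call with invalid arguments, is recorded as an error. That is one simple way to surface the incorrect function calls that the abstract reports for more complex tasks.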
