
Presentation information

[3F1-GS-10] AI application: Language model

Thu. May 30, 2024 9:00 AM - 10:40 AM Room F (Temporary room 4)

Chair: Tomoya Mizumoto (LY Corporation / SB Intuitions)

10:20 AM - 10:40 AM

[3F1-GS-10-05] Towards Automatic Generation of Graphic Layout with Large Multimodal Models

〇Limin Wang¹, Satoshi Waki¹, Toyotaro Suzumura¹ (1. The University of Tokyo)

Keywords: Graphic Design, Graphic Layout, Large Multimodal Models

Recent advances in generative models have made it possible for AI, rather than humans, to generate graphic layouts. Some existing layout generation methods use not only information about each element but also constraints such as the relationships between elements. However, these methods often require humans to specify the constraints, which can be burdensome. They are also limited to the category of each layout element, such as “image”, “text”, or “title”, and ignore the detailed content of the images or text. This study therefore proposes a method that leverages the detailed content of elements and automatically generates the constraints used for layout generation. Since elements can be either images or text, we explore the use of large multimodal models for extracting detailed content. This approach enables automatic generation of graphic layouts with less human input.

