JSAI 2024 - The 38th Annual Conference of the Japanese Society for Artificial Intelligence

Presentation Information

International Session

International Session » IS-2 Machine learning

[3Q5-IS-2b] Machine learning

Thursday, May 30, 2024, 15:30 - 17:10, Room Q (Conference Room 402)

Chair: Rafal Rzepka (Hokkaido University)

16:50 - 17:10

[3Q5-IS-2b-05] User Interface Design using Masked Language Modeling in a Transformer Encoder-based Model

〇Iskandar Salama1, Luiz Henrique Mormille1, Masayasu Atsumi1 (1. Soka University)

Keywords: Deep Learning, Masked Language Modeling, UI Design, Transformers, Self-Supervised Learning

In this paper, we present an exploration of User Interface (UI) Layout Understanding that leverages the strengths of transformer models together with self-supervised learning and curriculum learning, focusing primarily on masked language modeling for UI layout completion. The core challenge is interpreting UI design elements as tokens in a linguistic model, which transforms the traditional image-completion problem into a form of masked language modeling. Our research uses the extensive RICO dataset, comprising more than 66k UI screen images and over 3M UI elements, which are interpreted and processed as tokens in a linguistic structure. Using self-supervised learning, our model learns to predict missing UI elements in a sequence, imitating the masked language modeling process. This approach allows the transformer to develop an essential understanding of UI layouts without relying on labeled data. In addition, the model is trained with a curriculum learning strategy that gradually increases task complexity, i.e., the percentage of masked tokens among all tokens. The implications of this work extend beyond UI design, suggesting novel applications of transformer models and self-supervised learning in areas where visual elements can be interpreted through linguistic models.
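The abstract describes masked-token prediction over tokenized UI elements with a curriculum that raises the masking ratio over training. The following is a minimal PyTorch sketch of that idea, assuming UI layouts have already been tokenized into integer sequences; the class names, vocabulary size, hyperparameters, and linear masking schedule are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn

class UILayoutMLM(nn.Module):
    """Transformer encoder that predicts masked UI-element tokens.

    Hypothetical sketch: vocabulary size, embedding width, and layer
    counts are illustrative, not taken from the paper.
    """
    def __init__(self, vocab_size=512, d_model=256, nhead=8, num_layers=6, max_len=128):
        super().__init__()
        self.mask_id = vocab_size - 1                      # reserve last id as the [MASK] token
        self.token_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.head = nn.Linear(d_model, vocab_size)         # per-position logits over UI-element tokens

    def forward(self, tokens):
        pos = torch.arange(tokens.size(1), device=tokens.device)
        x = self.token_emb(tokens) + self.pos_emb(pos)
        return self.head(self.encoder(x))


def curriculum_mask_ratio(epoch, start=0.15, end=0.50, total_epochs=20):
    """Linearly raise the fraction of masked tokens as training progresses."""
    t = min(epoch / max(total_epochs - 1, 1), 1.0)
    return start + t * (end - start)


def mlm_step(model, tokens, mask_ratio, optimizer, loss_fn):
    """One self-supervised step: mask a fraction of UI tokens, predict the originals."""
    mask = torch.rand_like(tokens, dtype=torch.float) < mask_ratio
    inputs = tokens.clone()
    inputs[mask] = model.mask_id
    targets = tokens.clone()
    targets[~mask] = -100                                  # ignore unmasked positions in the loss
    logits = model(inputs)
    loss = loss_fn(logits.reshape(-1, logits.size(-1)), targets.reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    model = UILayoutMLM()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    loss_fn = nn.CrossEntropyLoss(ignore_index=-100)
    # Dummy batch of tokenized UI layouts; real data would come from the RICO dataset.
    batch = torch.randint(0, 511, (8, 64))
    for epoch in range(3):
        ratio = curriculum_mask_ratio(epoch)
        loss = mlm_step(model, batch, ratio, optimizer, loss_fn)
        print(f"epoch {epoch}: mask_ratio={ratio:.2f} loss={loss:.3f}")
```

As in standard masked language modeling, the loss is computed only at masked positions (via the ignore index), so the model is scored solely on how well it reconstructs the hidden UI elements from their surrounding layout context.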
