4:50 PM - 5:10 PM
[3Q5-IS-2b-05] User Interface Design using Masked Language Modeling in a Transformer Encoder-based Model
Keywords: Deep Learning, Masked Language Modeling, UI Design, Transformers, Self-Supervised Learning
In this paper, we present an innovative exploration of User Interface (UI) layout understanding that leverages the strengths of transformer models together with self-supervised learning and curriculum learning, focusing primarily on masked language modeling for UI layout completion. The core challenge is interpreting UI design elements as tokens in a linguistic model, which recasts the traditional image-completion problem as a form of masked language modeling. Our research uses the extensive RICO dataset, comprising more than 66k UI screen images and over 3M UI elements that are interpreted and processed as tokens in a linguistic structure. Through self-supervised learning, our model learns to predict missing UI elements in a sequence, imitating the masked language modeling process. This approach allows the transformer to develop an essential understanding of UI layouts without relying on labeled data. In addition, the model is trained with a curriculum learning strategy that gradually increases task complexity, i.e., the percentage of masked tokens among all tokens. The implications of this work extend beyond UI design, suggesting novel applications of transformer models and self-supervised learning in areas where visual elements can be interpreted through linguistic models.
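The abstract does not specify implementation details, but the two core ideas it describes, masking a fraction of UI-element tokens and growing that fraction over training via a curriculum, can be sketched minimally as follows. All names, the 15%–50% ratio range, and the linear schedule are illustrative assumptions, not the authors' actual method:

```python
import random

MASK_ID = 0  # hypothetical id reserved for the [MASK] token in the UI vocabulary


def mask_ui_tokens(tokens, mask_ratio, rng=None):
    """Replace a random fraction of UI-element tokens with MASK_ID.

    Returns the masked sequence and the set of masked positions; a model
    trained with the MLM objective would be asked to predict the original
    tokens at those positions.
    """
    rng = rng or random.Random()
    n_mask = max(1, int(len(tokens) * mask_ratio))
    positions = rng.sample(range(len(tokens)), n_mask)
    masked = list(tokens)
    for p in positions:
        masked[p] = MASK_ID
    return masked, set(positions)


def curriculum_mask_ratio(epoch, start=0.15, end=0.5, total_epochs=10):
    """Linearly increase the masking ratio over training epochs (assumed schedule)."""
    frac = min(epoch / max(total_epochs - 1, 1), 1.0)
    return start + frac * (end - start)


# Example: a screen serialized as 10 UI-element tokens, masked at epoch 0 and epoch 9.
screen = [5, 12, 7, 3, 9, 21, 4, 8, 16, 2]
easy, _ = mask_ui_tokens(screen, curriculum_mask_ratio(0), random.Random(42))
hard, _ = mask_ui_tokens(screen, curriculum_mask_ratio(9), random.Random(42))
```

With this schedule, early epochs mask roughly 15% of tokens and later epochs up to 50%, so the completion task becomes progressively harder as training proceeds.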