JSAI2025

Presentation information

Poster Session

[3Win5] Poster session 3

Thu. May 29, 2025 3:30 PM - 5:30 PM Room W (Event hall D-E)

[3Win5-25] Transformer-Based Vector Font Classification with Patch Embedding

〇Takumu Fujioka1, Gouhei Tanaka1,2 (1.Nagoya Institute of Technology, 2.The University of Tokyo)

Keywords:Transformer, Vector font

The fonts used in modern publications, websites, and digital media adopt a vector format, which allows scaling without loss of image quality. However, most deep learning methods for tasks such as font generation, transformation, and classification have focused on bitmap representations, and deep learning for vector fonts remains relatively underexplored. The shape of a vector font character is represented as a sequence of drawing commands, and existing methods treat each drawing command as an individual token. In this study, we propose a patch-embedding method for Transformer-based vector font classification, analogous to tokenization in language models or to patch partitioning in bitmap-based image classification models. We demonstrate through numerical experiments that patch embeddings enhance performance and stabilize training.
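As a rough illustration of the idea in the abstract, the sketch below groups consecutive drawing-command feature vectors into patches and projects each patch to the Transformer model dimension, in the spirit of ViT-style patch embedding. The class name, patch size, and feature dimensions are illustrative assumptions and not the authors' implementation.

```python
import torch
import torch.nn as nn

class CommandPatchEmbedding(nn.Module):
    """Illustrative sketch: group consecutive drawing-command tokens into
    patches and project each patch to the Transformer dimension."""

    def __init__(self, cmd_dim: int, patch_size: int, d_model: int):
        super().__init__()
        self.patch_size = patch_size
        # One linear projection maps a flattened patch of commands to d_model,
        # analogous to the patch projection in bitmap-based ViT models.
        self.proj = nn.Linear(cmd_dim * patch_size, d_model)

    def forward(self, commands: torch.Tensor) -> torch.Tensor:
        # commands: (batch, seq_len, cmd_dim), with seq_len assumed to be
        # padded to a multiple of patch_size.
        b, n, d = commands.shape
        patches = commands.reshape(b, n // self.patch_size, self.patch_size * d)
        return self.proj(patches)  # (batch, num_patches, d_model)

# Hypothetical usage: 8 drawing commands per glyph, each encoded as a
# 10-dim feature vector, grouped into patches of 4 commands.
x = torch.randn(2, 8, 10)
embed = CommandPatchEmbedding(cmd_dim=10, patch_size=4, d_model=256)
print(embed(x).shape)  # torch.Size([2, 2, 256])
```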
