JSAI2023

Presentation information

General Session

[1O5-GS-7] Vision, speech media processing

Tue. Jun 6, 2023 5:00 PM - 7:00 PM Room O (E1+E2)

Chair: 真矢 滋 (Toshiba) [on-site]

6:20 PM - 6:40 PM

[1O5-GS-7-05] Multi Aspect Ratio Vision Transformer for Predicting Display Advertising Effects

〇Naoto Tanji (1), Toshihiko Yamasaki (2) (1. Septeni Japan, Inc., 2. The University of Tokyo)

Keywords: deep learning, computer vision, online advertisements, Transformer

Predicting the effectiveness of an online advertisement in advance is important for producing effective advertisements. Because display advertising images distributed on the internet come in a wide range of aspect ratios, their effectiveness can be predicted more accurately when the aspect ratio is taken into account. We propose a Vision Transformer model that uses relative position bias to handle images of arbitrary aspect ratio. We apply the model to click-through rate prediction on real advertising delivery data and confirm that it outperforms baseline models that resize images to a fixed aspect ratio.
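
The abstract does not spell out how the relative position bias is built, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a single bias table indexed by the relative offset (dy, dx) between two patches, sized for a hypothetical maximum grid (`MAX_H`, `MAX_W`), with random values standing in for learned parameters. All function and variable names are illustrative. Because the lookup depends only on offsets, the same table can serve patch grids of different shapes, so an image need not be resized to a fixed aspect ratio first.

```python
import random


def make_bias_table(max_h, max_w, seed=0):
    """Hypothetical stand-in for a learnable table: one scalar per offset (dy, dx)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.02) for _ in range(2 * max_w - 1)]
            for _ in range(2 * max_h - 1)]


def relative_position_bias(table, h, w, max_h, max_w):
    """Return an (h*w) x (h*w) bias matrix for an h x w patch grid."""
    if h > max_h or w > max_w:
        raise ValueError("grid exceeds the offsets covered by the table")
    # Patches are enumerated in row-major order.
    coords = [(y, x) for y in range(h) for x in range(w)]
    bias = []
    for qy, qx in coords:
        row = []
        for ky, kx in coords:
            # Shift offsets so they are valid, non-negative table indices.
            dy = qy - ky + (max_h - 1)
            dx = qx - kx + (max_w - 1)
            row.append(table[dy][dx])
        bias.append(row)
    return bias


MAX_H, MAX_W = 8, 8  # assumed upper bound on the patch grid
table = make_bias_table(MAX_H, MAX_W)

# Two grids with the same patch count but different aspect ratios.
wide = relative_position_bias(table, 2, 6, MAX_H, MAX_W)  # landscape banner
tall = relative_position_bias(table, 6, 2, MAX_H, MAX_W)  # portrait banner
```

In a typical relative-position-bias attention layer, a matrix like `wide` or `tall` would be added to the query-key attention logits before the softmax. Whether the proposed model does exactly this is not stated in the abstract.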
