JSAI2022

Presentation information

General Session

[2O1-GS-7] Vision, speech media processing: generation

Wed. Jun 15, 2022 9:00 AM - 10:40 AM Room O (Room 510)

Chair: Shuhei Kurita (RIKEN) [On-site]

9:40 AM - 10:00 AM

[2O1-GS-7-03] Visual Explanation Generation Based on Lambda Attention Branch Networks

〇Tsumugi Iida¹, Kanta Kaneda¹, Tsubasa Hirakawa², Takayoshi Yamashita², Hironobu Fujiyoshi², Komei Sugiura¹ (1. Keio University, 2. Chubu University)

Keywords: Visual Explanation Generation, Attention Branch, Lambda Networks, Transformer

Explanation generation for transformers enhances accountability for their predictions.
However, few studies have addressed visual explanation generation for transformers that use multidimensional context, such as LambdaNetworks.
In this paper, we propose Lambda Attention Branch Networks, which attend to important regions in detail and generate easily interpretable visual explanations.
We also propose the Patch Insertion-Deletion score, an extension of the Insertion-Deletion score, as an effective evaluation metric for images with sparse important regions.
Experimental results on two public datasets indicate that the proposed method successfully generates visual explanations.
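
The abstract does not define the Patch Insertion-Deletion score, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes the score is the area under the insertion curve minus the area under the deletion curve, with pixels inserted or deleted one patch at a time in order of summed attribution. The function name `patch_insertion_deletion`, the `patch` and `baseline` parameters, and the toy `model` are all hypothetical.

```python
import numpy as np

def patch_insertion_deletion(image, attribution, model, patch=4, baseline=0.0):
    """Hypothetical patch-wise Insertion-Deletion score (assumed definition).

    image:       array of shape (H, W) or (H, W, C)
    attribution: array of shape (H, W), higher means more important
    model:       callable mapping an image to a scalar class score
    """
    h, w = attribution.shape
    assert h % patch == 0 and w % patch == 0, "patch size must divide the map"
    gh, gw = h // patch, w // patch

    # Sum attribution inside each patch and rank patches, most important first.
    patch_attr = attribution.reshape(gh, patch, gw, patch).sum(axis=(1, 3))
    order = np.argsort(-patch_attr, axis=None, kind="stable")

    inserted = np.full_like(image, baseline, dtype=float)  # starts empty
    deleted = image.astype(float).copy()                   # starts complete
    ins_scores = [model(inserted)]
    del_scores = [model(deleted)]
    for idx in order:
        r, c = divmod(int(idx), gw)
        rows = slice(r * patch, (r + 1) * patch)
        cols = slice(c * patch, (c + 1) * patch)
        inserted[rows, cols] = image[rows, cols]  # reveal this patch
        deleted[rows, cols] = baseline            # erase this patch
        ins_scores.append(model(inserted))
        del_scores.append(model(deleted))

    # Area under each curve over the fraction of patches changed (trapezoid rule).
    x = np.linspace(0.0, 1.0, len(order) + 1)

    def auc(y):
        y = np.asarray(y, dtype=float)
        return float(np.sum((y[1:] + y[:-1]) / 2.0 * np.diff(x)))

    return auc(ins_scores) - auc(del_scores)

# Toy check: the "model" only looks at the top-left 4x4 block of an 8x8 image.
image = np.zeros((8, 8))
image[:4, :4] = 1.0

def model(img):
    return float(img[:4, :4].mean())

good_score = patch_insertion_deletion(image, image, model)        # attribution on the block
bad_score = patch_insertion_deletion(image, 1.0 - image, model)   # attribution off the block
```

Under this assumed definition, an attribution map that highlights the region the model actually uses scores higher than one that highlights everything else. Working at patch granularity rather than pixel granularity is what the sketch takes "patch" to mean; the paper itself should be consulted for the exact formulation.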
