JSAI2022

Presentation information

Interactive Session

General Session » Interactive Session

[3Yin2] Interactive session 1

Thu. Jun 16, 2022 11:30 AM - 1:10 PM Room Y (Event Hall)

[3Yin2-20] End-to-end training of Object Segmentation Task and Video Question-Answering Task

〇Hidemoto Nakada1, Hideki Asoh1 (1.National Institute of Advanced Industrial Science and Technology)

Keywords:VAQ, Object detection

For complicated VQA tasks that incorporates multiple objects, to train the VQA model using segmented objects data as inputs is proved to be effective for various downstream tasks. In this work we tried to train the VQA task model and object segmentation model in end-to-end fashion instead of training independently. We employed CLEVRER as a target VQA task. We first trained MONet, an object segmentation network, with the dataset, and trained Aloe, a VQA model, using the output of the trained MONet. Finally we connect MONet ans Aloe to finetune them in end-to-end setting and confirmed that the performance of VQA task has been improved.

Authentication for paper PDF access

A password is required to view paper PDFs. If you are a registered participant, please log on the site from Participant Log In.
You could view the PDF with entering the PDF viewing password bellow.

Password