JSAI2022

Interactive Session

[3Yin2] Interactive session 1

Thu. Jun 16, 2022 11:30 AM - 1:10 PM Room Y (Event Hall)

[3Yin2-31] Open-Domain Live Commentary Generation

〇Yumi Hamazono (1,3), Edison Marrese-Taylor (3), Tatsuya Ishigaki (3), Yusuke Miyao (2,3), Ichiro Kobayashi (1,3), Hiroya Takamura (3)
(1. Ochanomizu University, 2. The University of Tokyo, 3. National Institute of Advanced Industrial Science and Technology)

Keywords: Natural language processing, Multimodal

In a live commentary, a commentator makes objective statements or subjective comments about the events in a video in real time.
Research on automatically generating such live commentary has conventionally targeted specific fields, such as sports, so it commonly relies on field-specific information.
Our study addresses live commentary generation for open-domain videos, a difficult setting because domain-specific features cannot be used.
We first construct a dataset of videos from various domains paired with live commentaries collected by crowdsourcing, and then train a live commentary generation model that takes both the video and the context into account.
Experiments show that a multimodal Transformer that considers both video and contextual text performs well.
