Presentation information

Organized Session » OS-22

[3D5-OS-22b] OS-22 (2)

Thu. Jun 11, 2020 3:40 PM - 5:20 PM Room D (jsai2020online-4)

Miki Ueno (Osaka Institute of Technology), Naoki Mori (Osaka Prefecture University), Taichi Hatanaka (Creator's in Pack Inc.)

3:40 PM - 4:00 PM

[3D5-OS-22b-01] Corresponding identification between comic images and dialog using distributed representation

〇Akira Terauchi (1), Naoki Mori (1), Miki Ueno (2) (1. Osaka Prefecture University, 2. Osaka Institute of Technology)

Keywords: Comic Engineering, Multi-modal analysis, Convolutional AutoEncoder

Understanding human creations such as comics, novels, and music with artificial intelligence (AI) has become an attractive research topic in the AI field. However, creating an interesting story or comic remains a difficult task because it requires a great deal of human creativity. In this study, we focus on whether AI can understand comics, using four-scene comics as the target because they have a clear structure and format. Many studies applying image models or natural-language models to such tasks have been proposed, but few combine image and natural-language features as multi-modal data. In this study, we propose a method that combines images and language to understand four-scene comics using deep learning. The effectiveness of the proposed method is confirmed by computer simulations on koma (panel) prediction problems for four-scene comics.
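The correspondence task in the title can be illustrated with a minimal sketch. Assuming panel images have already been encoded to latent vectors (e.g., by the encoder half of a convolutional autoencoder) and dialog lines to distributed representations (e.g., averaged word embeddings), one simple matching scheme, not necessarily the authors' exact method, pairs each panel with its most cosine-similar dialog. All vectors below are toy stand-ins for real encoder outputs.

```python
import numpy as np

def cosine_sim_matrix(A, B):
    """Row-wise cosine similarity between two sets of vectors."""
    A_n = A / np.linalg.norm(A, axis=1, keepdims=True)
    B_n = B / np.linalg.norm(B, axis=1, keepdims=True)
    return A_n @ B_n.T

def match_panels_to_dialogs(panel_latents, dialog_embeddings):
    """For each panel, return the index of the most similar dialog."""
    sims = cosine_sim_matrix(panel_latents, dialog_embeddings)
    return sims.argmax(axis=1)

# Toy 3-d latents standing in for real encoder outputs (4 panels, 4 dialogs).
panels = np.array([[1.0, 0.1, 0.0],
                   [0.0, 1.0, 0.1],
                   [0.1, 0.0, 1.0],
                   [1.0, 1.0, 0.0]])
dialogs = np.array([[0.9, 0.0, 0.1],
                    [0.1, 0.9, 0.0],
                    [0.0, 0.1, 0.9],
                    [0.9, 0.8, 0.1]])

assignment = match_panels_to_dialogs(panels, dialogs)
print(assignment.tolist())  # each panel paired with its most similar dialog
```

In a trained system, the image and text encoders would be optimized jointly so that matching panel–dialog pairs land close together in the shared space; the greedy argmax here could also be replaced by an optimal one-to-one assignment.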
