Presentation information

Organized Session » OS-19

[4E2-OS-19a] OS-19 (1)

Fri. Jun 12, 2020 12:00 PM - 1:40 PM Room E (jsai2020online-5)

Masahide Yuasa (Shonan Institute of Technology), Shogo Okada (Japan Advanced Institute of Science and Technology), Genki Sakai (Tokyo Denki University), Masaki Shuzo (Tokyo Denki University)

1:00 PM - 1:20 PM

[4E2-OS-19a-04] Automatic Generation of Responses and Facial Expressions based on Multimodal Information

〇Ryosuke Ueno1, Jie Zeng1, Yukiko Nakano1 (1. Seikei University)

Keywords: multimodal, emotional response, dialogue

It is important for conversational agents to display emotional and empathetic responses through both verbal and nonverbal behaviors. Toward multimodal generation in conversational agents, this study proposes a multimodal deep learning model that simultaneously predicts a category of verbal acknowledgement and a facial expression. First, unimodal encoders for audio, language, and face movement were trained; the outputs of these encoders were then fused to train a multimodal decoder that predicts the verbal acknowledgement and facial expression. The proposed model substantially outperforms a baseline model.
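The encode-fuse-decode pipeline described in the abstract can be sketched as below. This is only an illustrative sketch of the general architecture (linear-layer encoders, concatenation fusion, two classification heads); all feature dimensions, hidden sizes, and class counts are hypothetical assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, w):
    # Stand-in for a trained unimodal encoder: one linear layer + tanh.
    return np.tanh(x @ w)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical per-modality feature vectors for one dialogue segment.
audio = rng.standard_normal(40)    # e.g. prosodic features (assumed size)
text = rng.standard_normal(300)    # e.g. word embeddings (assumed size)
face = rng.standard_normal(17)     # e.g. facial action units (assumed size)

h = 32  # shared hidden size (assumption)
W_a = rng.standard_normal((40, h)) * 0.1
W_t = rng.standard_normal((300, h)) * 0.1
W_f = rng.standard_normal((17, h)) * 0.1

# Fusion: concatenate the three unimodal encoder outputs.
fused = np.concatenate([encode(audio, W_a), encode(text, W_t), encode(face, W_f)])

# Multimodal decoder with two heads: a verbal-acknowledgement category
# and a facial-expression class, predicted at the same time.
n_ack, n_expr = 5, 7  # hypothetical class counts
W_ack = rng.standard_normal((3 * h, n_ack)) * 0.1
W_expr = rng.standard_normal((3 * h, n_expr)) * 0.1

p_ack = softmax(fused @ W_ack)    # distribution over acknowledgement types
p_expr = softmax(fused @ W_expr)  # distribution over facial expressions
```

In practice each encoder and the joint decoder would be deeper trained networks; the sketch only shows how the unimodal outputs are fused into one representation that drives both prediction heads.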
