JSAI2020

Presentation information

Interactive Session

[3Rin4] Interactive 1

Thu. Jun 11, 2020 1:40 PM - 3:20 PM Room R01 (jsai2020online-2-33)

[3Rin4-04] Generation of Facial Animation from Voice using End-to-End Learning

〇Hirofumi Omichi1, Kazuya Mera1, Yoshiaki Kurosawa1, Toshiyuki Takezawa1 (1.Graduate School of Information Sciences, Hiroshima City University)

Keywords: Deep Learning, Generation of Facial Animation, Entertainment, Voice

Expressive facial animation plays an important role in communication. Some avatars can produce facial expressions with face tracking, one of the typical facial expression synchronization methods, but such methods cannot create expressions from previously recorded speech or from synthetic speech, which carry no facial expression information. In this study, we propose a method that generates facial animation from voice alone. Specifically, a learning model is designed that takes the acoustic features of the uttered speech as input and uses the Action Unit (AU) parameters analyzed from facial expression video as teacher data. The experimental results indicated that the loss value of the proposed method was lower than that of the existing method. In addition, the AU activities produced by the proposed method fluctuated more smoothly than those of the existing method, which is likely to be perceived as more natural facial expression.
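The abstract does not specify the network architecture, so the following is only a minimal sketch of the general idea: a frame-level regressor that maps acoustic features of the voice to AU intensities, trained end-to-end against AU parameters extracted from facial expression video. The feature dimension, number of AUs, recurrent layers, and MSE loss are all assumptions for illustration, not details from the paper.

```python
# Hypothetical sketch, not the authors' exact model: regress per-frame AU
# intensities from per-frame acoustic features (e.g. mel-spectrogram frames).
import torch
import torch.nn as nn

N_ACOUSTIC = 40   # assumed acoustic-feature dimension per frame
N_AU = 17         # assumed number of Action Units predicted per frame

class VoiceToAU(nn.Module):
    def __init__(self):
        super().__init__()
        # A recurrent layer captures temporal context in the voice signal;
        # a linear head outputs one AU-intensity vector per frame.
        self.rnn = nn.GRU(N_ACOUSTIC, 128, num_layers=2, batch_first=True)
        self.head = nn.Linear(128, N_AU)

    def forward(self, x):              # x: (batch, frames, N_ACOUSTIC)
        h, _ = self.rnn(x)
        return self.head(h)            # (batch, frames, N_AU)

model = VoiceToAU()
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Dummy batch standing in for (acoustic features, AU teacher data).
voice = torch.randn(8, 100, N_ACOUSTIC)
au_targets = torch.rand(8, 100, N_AU)

pred = model(voice)
loss = criterion(pred, au_targets)     # loss against AU teacher data
loss.backward()
optimizer.step()
print(f"training loss: {loss.item():.4f}")
```

The predicted AU sequence would then drive the facial animation; smoother AU trajectories, as reported in the abstract, correspond to less jittery expressions.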
