Keywords: Deep Learning, Generation of Facial Animation, Entertainment, Voice
Expressive facial animation plays an important role in communication. Some avatars can convey expressions using face tracking, one of the typical methods for synchronizing facial expressions, but facial expressions cannot be created from previously recorded speech or from synthesized speech that carries no expression information. In this study, we propose a method for generating facial animation from voice alone. Specifically, we design a learning model that takes the acoustic features of the uttered speech as input and uses the Action Unit (AU) parameters analyzed from facial expression videos as teacher data. The experimental results indicated that the loss value of the proposed method was lower than that of the existing method. In addition, the AU activations produced by the proposed method fluctuated more smoothly than those of the existing method, which should be perceived as more natural facial expressions.
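As a rough illustration of the pipeline described in the abstract (not the authors' actual model), the mapping from per-frame acoustic features to AU activations can be treated as a regression problem, with the predicted AU curves smoothed over time. The feature dimension, AU count, and linear model below are illustrative assumptions only.

```python
import numpy as np

# Illustrative dimensions (assumptions, not from the paper):
# 13 acoustic features (e.g. MFCCs) per audio frame, 17 Action Units (AUs).
N_FRAMES, N_MFCC, N_AU = 200, 13, 17

rng = np.random.default_rng(0)
X = rng.normal(size=(N_FRAMES, N_MFCC))              # acoustic features per frame
true_W = rng.normal(size=(N_MFCC, N_AU))
Y = X @ true_W + 0.1 * rng.normal(size=(N_FRAMES, N_AU))  # AU "teacher" data

# Closed-form ridge regression: W = (X^T X + lam*I)^-1 X^T Y
lam = 1e-2
W = np.linalg.solve(X.T @ X + lam * np.eye(N_MFCC), X.T @ Y)
pred = X @ W                                          # predicted AU activations

# Temporal smoothing (moving average) so AU curves fluctuate less abruptly,
# in the spirit of the "smoother fluctuation" result reported in the abstract.
kernel = np.ones(5) / 5.0
smoothed = np.apply_along_axis(
    lambda col: np.convolve(col, kernel, mode="same"), 0, pred
)

mse = float(np.mean((pred - Y) ** 2))
print(f"training MSE: {mse:.4f}")
```

In practice the paper's method uses a deep learning model rather than this linear sketch; the sketch only shows the input/output shapes (frame-wise acoustic features in, frame-wise AU parameters out) and a simple smoothing step.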