Keywords: multimodal emotion estimation, Decoded Neurofeedback, Long short-term memory
Emotion estimation using electroencephalogram (EEG) signals can estimate a user's emotion directly because it is not influenced by biases in the expression and recognition processes. However, it is impractical for every user to wear an EEG headset at all times, since such headsets are expensive and uncomfortable. Therefore, this paper proposes a method that estimates emotion from voice and facial expression as substitutes for EEG. A Long short-term memory (LSTM) network is used as the machine learning method to handle transitions in facial expression and tone of voice. The acoustic features are the time-series variations of F0, voicing probability, and loudness calculated from the voice, and the facial expression features are the time-series variations of 12 kinds of facial parts. The experimental results indicated that the method using only facial expression features estimated the user's emotions better than the other experimental conditions, with accuracies of 0.621, 0.657, and 0.578 for valence, arousal, and expectation, respectively.
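To illustrate how an LSTM consumes such time-series features, the sketch below runs a single hand-written LSTM cell over a sequence of per-frame facial-expression vectors. It is a minimal illustration, not the paper's implementation: the 12-dimensional input matches the 12 kinds of facial parts mentioned above, but the hidden size, sequence length, and random weights (standing in for trained parameters) are assumptions for demonstration only.

```python
import numpy as np

rng = np.random.default_rng(0)

D, H = 12, 16  # D: 12 facial-part features per frame; H: illustrative hidden size

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step.

    W: (4H, D) input weights, U: (4H, H) recurrent weights, b: (4H,) biases,
    with the four gate blocks stacked in the order [input, forget, output, candidate].
    """
    z = W @ x + U @ h_prev + b        # pre-activations for all four gates
    i = sigmoid(z[0:H])               # input gate
    f = sigmoid(z[H:2 * H])           # forget gate
    o = sigmoid(z[2 * H:3 * H])       # output gate
    g = np.tanh(z[3 * H:4 * H])       # candidate cell state
    c = f * c_prev + i * g            # new cell state
    h = o * np.tanh(c)                # new hidden state
    return h, c

# Random weights stand in for trained parameters (assumption for illustration).
W = rng.normal(scale=0.1, size=(4 * H, D))
U = rng.normal(scale=0.1, size=(4 * H, H))
b = np.zeros(4 * H)

# A hypothetical 30-frame sequence of facial-expression features.
seq = rng.normal(size=(30, D))
h = np.zeros(H)
c = np.zeros(H)
for x in seq:
    h, c = lstm_step(x, h, c, W, U, b)

# In a full model, the final hidden state h would feed a classifier head
# producing the valence, arousal, and expectation estimates.
print(h.shape)
```

In practice, a framework implementation (e.g. a stock LSTM layer) would replace the hand-written cell; the point here is only the recurrence that lets the model track transitions of facial expression over time.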