4:00 PM - 4:20 PM
[3G5-GS-6-02] A Speech Dialogue System Utilizing Voice Activity Prediction and Objective Evaluation of Naturalness
Keywords: Large Language Model, Dialogue Systems, Turn-taking, Natural Language Processing, Interjections
With the advancement of natural language processing technologies, dialogue systems that handle continuous speech are becoming increasingly prevalent. In dialogue systems that provide backchanneling in particular, response delays and interruptions during speech can disrupt natural conversation. Evaluating these systems is also challenging because it is difficult to separate backchanneling from the main dialogue. In this study, we focus on turn-taking to achieve natural interactions that include backchanneling, and we develop a dialogue system that uses Voice Activity Projection (VAP). The system predicts the start and end times of conversations, which allows it to distinguish between backchanneling and interruptive speech. Experiments confirmed improvements in naturalness, indicating the approach's effectiveness for future dialogue system development.