[4Xin2-34] Standardization for Absorbing Variations in Pause Duration Distribution in Pause Duration Estimation for Reading-Style Speech Synthesis
Keywords: Reading-Style Speech Synthesis, Pause Duration Estimation, Standardization, Natural Language Processing
In storytelling speech, the distribution of pause durations varies with the text, the reader, and whether the text is a spoken line (dialogue) or not. In this study, we attempted to absorb these differences by standardizing the pause durations in the training data used to learn a model that predicts pause positions and pause durations from the text to be read aloud. Among the standardization methods we compared, standardization within each audiobook was the most effective.
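As an illustration of what standardization within each audiobook could look like, the following is a minimal sketch and not the authors' implementation. It assumes that "standardization" means z-scoring (subtracting the per-audiobook mean and dividing by the per-audiobook standard deviation), which the abstract does not state explicitly. The function names `standardize_per_book` and `destandardize`, and the `(book_id, duration)` input layout, are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardize_per_book(pauses):
    """Z-score pause durations separately for each audiobook.

    pauses: list of (book_id, duration_in_seconds) pairs (assumed layout).
    Returns (z_scores, stats), where stats maps book_id -> (mean, std).
    """
    # Group durations by audiobook.
    by_book = defaultdict(list)
    for book_id, duration in pauses:
        by_book[book_id].append(duration)

    # Per-audiobook mean and population standard deviation.
    stats = {}
    for book_id, durations in by_book.items():
        mu = mean(durations)
        sigma = pstdev(durations)
        # Guard against division by zero when all pauses are identical.
        stats[book_id] = (mu, sigma if sigma > 0 else 1.0)

    # Standardize each pause with the statistics of its own audiobook.
    z_scores = [(d - stats[b][0]) / stats[b][1] for b, d in pauses]
    return z_scores, stats


def destandardize(z, book_id, stats):
    """Map a predicted standardized value back to seconds for one audiobook."""
    mu, sigma = stats[book_id]
    return z * sigma + mu


# Two hypothetical audiobooks with very different pause scales.
pauses = [("book_a", 0.2), ("book_a", 0.4), ("book_b", 1.0), ("book_b", 3.0)]
z_scores, stats = standardize_per_book(pauses)
```

In this sketch, a model would be trained on `z_scores` instead of raw seconds, so that audiobooks with short and long pauses share a common target scale. At synthesis time, `destandardize` converts a predicted value back to seconds using the statistics of a chosen audiobook.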