JSAI2025

[2Win5] Poster session 2

Wed. May 28, 2025 3:30 PM - 5:30 PM Room W (Event hall D-E)

[2Win5-61] Control of Speech Synthesis Using Music-oriented Constraints

〇Yume Sato¹, Katsuhito Sudoh¹ (1. Nara Women's University)

Keywords: Speech Synthesis

This study aims to generate natural speech with richer emotional and expressive qualities by incorporating music-oriented constraints into speech synthesis.
We propose a method based on Style-Bert-VITS2 that integrates pitch-related musical elements into the transcriptions.
The model is trained on the PJS corpus with pitch constraints, together with additional speech-text pairs from CSJ.
Experimental results demonstrate that the generated speech effectively reflects the imposed constraints.
