speak like a dog!

Kohei Suzuki; Shoki Sakamoto; Tadahiro Taniguchi; Hirokazu Kameoka

[3Yin2-37] speak like a dog!

dog speech synthesis using non-parallel voice conversion with deep learning

〇Kohei Suzuki¹, Shoki Sakamoto¹, Tadahiro Taniguchi¹, Hirokazu Kameoka² (1.Ritsumeikan University, 2.NTT Communication Science Laboratories)

Keywords: Voice Conversion

In this study, we propose a method to convert human speech into dog-like speech while retaining the linguistic information.
One type of board game, the Table Talk Role-Playing Game (TRPG), features a wide variety of imaginary creatures such as goblins and zombies.
Voice Conversion (VC) may be used to represent the voices of such imaginary creatures.
To achieve this goal, we conducted comparison experiments across two audio features (mel-cepstral coefficients and mel-spectrograms), two non-parallel VC methods (variational autoencoder based and generative adversarial network based), and five kernel sizes.
Although we were able to convert human voice into dog voice in fragments, retaining the linguistic information remains difficult, and further improvements are needed.

Presentation information

[3Yin2] Interactive session 1
