[3Yin2-37] Speak like a dog!
Dog speech synthesis using non-parallel voice conversion with deep learning
Keywords: Voice Conversion
In this study, we propose a method to convert human speech into dog-like speech while retaining the linguistic information.
One type of board game is a Table Talk Role-Playing Game (TRPG), which has a wide variety of imaginary creatures such as goblins and zombies.
Voice Conversion (VC) may be used to represent the voices of such imaginary creatures.
To achieve this goal, we conducted comparison experiments across two audio features (mel-cepstral coefficients and mel-spectrogram), two non-parallel VC methods (variational-autoencoder-based and generative-adversarial-network-based), and five kernel sizes.
Although we were able to convert human speech into dog-like speech in fragments, preserving the linguistic information remains difficult, and further improvements are needed.
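The comparison described above combines two audio features, two non-parallel VC methods, and five kernel sizes, which implies 2 × 2 × 5 = 20 configurations if every combination is evaluated. The following is a minimal sketch of how such a grid could be enumerated. It is an illustration only, not the authors' experiment code: the abstract does not state the kernel size values, so the `kernel_1` to `kernel_5` labels are placeholders, and the names `FEATURES`, `METHODS`, `KERNEL_SIZES`, and `build_grid` are hypothetical.

```python
from itertools import product

# The two audio features and two VC method families named in the abstract.
FEATURES = ["mel-cepstral coefficients", "mel-spectrogram"]
METHODS = ["VAE-based", "GAN-based"]

# Placeholder labels: the abstract says five kernel sizes were compared
# but does not give their values, so no numbers are assumed here.
KERNEL_SIZES = ["kernel_1", "kernel_2", "kernel_3", "kernel_4", "kernel_5"]


def build_grid():
    """Return one dict per (feature, method, kernel size) combination."""
    return [
        {"feature": feature, "method": method, "kernel_size": kernel}
        for feature, method, kernel in product(FEATURES, METHODS, KERNEL_SIZES)
    ]


grid = build_grid()
print(len(grid))  # 20
for config in grid[:3]:
    print(config)
```

Whether the study evaluated the full cross product or only a subset is not stated in the abstract; the sketch assumes the full grid.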