14:30 〜 14:50
[2K4-ES-2-03] Many-to-many Voice Conversion based on a CycleGAN using a Radial Loss
Keywords: voice conversion, deep learning, CycleGAN, GANs, parallel-data free
Voice conversion (VC) is a technique that allows a person to speak with the voice of another person. It is an application of voice processing that relies on both signal processing and machine learning. In this paper, we propose a many-to-many voice conversion method based on a CycleGAN, which we call the Radial CycleGAN. In this method, the generators consist of a general encoder (ENC-0) and a general decoder (DEC-0) for a standard voice sample (a TTS voice), plus an encoder-decoder pair for each new voice sample. In addition to the commonly used cycle and identity losses, we define a radial loss between encoders and decoders to train the generators and discriminators. Training for each new user fits a new (encoder, decoder) pair against the standard TTS pair, which makes it possible to convert voices directly using the trained encoder-decoder pairs. This method will contribute to building real-time systems that convert robustly among pretrained speakers' voices and that make it easy to add new users by collecting a small dataset from each of them.
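The cycle and identity losses mentioned in the abstract can be illustrated with a toy sketch. It assumes that a generator is one speaker's encoder followed by another speaker's decoder, and it models each encoder and decoder as a simple scalar gain; both are illustrative assumptions, and all function names are hypothetical. The radial loss is not defined in the abstract, so it is deliberately left out.

```python
# Toy sketch only: "voices" are short feature vectors and each speaker's
# encoder/decoder is a scalar gain. This is NOT the paper's network.

def l1(a, b):
    """Mean absolute error between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def make_encoder(scale):
    """Hypothetical encoder: maps a speaker's features to a shared space."""
    return lambda x: [v / scale for v in x]

def make_decoder(scale):
    """Hypothetical decoder: maps the shared space to a speaker's features."""
    return lambda z: [v * scale for v in z]

def make_generator(encode_src, decode_dst):
    """Assumed generator: source encoder followed by target decoder."""
    return lambda x: decode_dst(encode_src(x))

def cycle_loss(x, g_ab, g_ba):
    """Cycle consistency: converting A -> B -> A should reproduce x."""
    return l1(g_ba(g_ab(x)), x)

def identity_loss(y, g_ab):
    """Identity: a sample already in B should pass through G(A->B) unchanged."""
    return l1(g_ab(y), y)

# ENC-0 / DEC-0 stand for the standard (TTS) voice; speaker A is a new voice.
enc_0, dec_0 = make_encoder(1.0), make_decoder(1.0)
enc_a, dec_a = make_encoder(2.0), make_decoder(2.0)

g_0a = make_generator(enc_0, dec_a)  # standard voice -> speaker A
g_a0 = make_generator(enc_a, dec_0)  # speaker A -> standard voice

x_std = [1.0, -0.5, 0.25]            # toy sample of the standard voice
y_a = [2.0, 4.0]                     # toy sample of speaker A

print(cycle_loss(x_std, g_0a, g_a0))
print(identity_loss(y_a, g_0a))
```

In this toy setup the cycle loss is zero because the two generators are exact inverses, while the identity loss is non-zero because the untrained gain changes a sample that already belongs to the target speaker. In a real system both terms would be minimized jointly with the adversarial loss and the radial loss.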