Speaker-independent acoustic features extraction using StarGAN-VC and its applications for double articulation analysis

Soichiro Komura

11:20 AM - 11:40 AM

[4I2-GS-7c-02] Speaker-independent acoustic features extraction using StarGAN-VC and its applications for double articulation analysis

〇Soichiro Komura¹, Kaede Hayashi¹, Akira Taniguchi¹, Tadahiro Taniguchi¹, Hirokazu Kameoka² (1. Ritsumeikan University, 2. NTT Communication Science Laboratories)

Keywords: NPB-DAA, StarGAN-VC, Neuro-SERKET, Unsupervised learning

The nonparametric Bayesian double articulation analyzer (NPB-DAA) is a method for discovering words and phoneme units from continuous speech signals in an unsupervised manner. However, acoustic features are speaker-dependent, which prevents NPB-DAA from discovering words and phoneme units in multi-speaker utterances. This paper proposes using a star generative adversarial network for voice conversion (StarGAN-VC) to extract speaker-independent acoustic features, and optimizing NPB-DAA and StarGAN-VC simultaneously through mutual learning based on the Neuro-SERKET framework. The effect of mutual learning is shown through an experiment.


Presentation information

[4I2-GS-7c] Image and Speech Media Processing: Speech Recognition and Instruction Understanding
