JSAI2020

Presentation information

Interactive Session

[3Rin4] Interactive 1

Thu. Jun 11, 2020 1:40 PM - 3:20 PM Room R01 (jsai2020online-2-33)

[3Rin4-59] Improving Spectrograms for Sound Enhancement based on Image-to-image Translation

〇Yoshiaki KUROSAWA1, Kazuya MERA1, Toshiyuki TAKEZAWA1 (1.Graduate School of Information Sciences, Hiroshima City University)

Keywords: deep learning, sound enhancement, image transform

We examined pix2pix, a well-known image-to-image translation technique based on deep neural networks. Focusing on time-frequency analysis and implementing an auxiliary classifier generative adversarial network (ACGAN), we evaluated how well spectrograms can be transformed for sound enhancement. Measured with the image-quality index SSIM, the transform performance improved slightly compared with the original research.
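To make the SSIM-based evaluation concrete, the following is a minimal sketch of how two spectrograms might be compared with SSIM. It is not the authors' implementation: the synthetic 440 Hz tone, the noise level, the STFT settings (`n_fft`, `hop`), the function names `spectrogram` and `global_ssim`, and the per-spectrogram scaling to [0, 1] are all assumptions made for illustration. It also uses a simplified single-window (global) SSIM rather than the usual locally windowed variant.

```python
import numpy as np


def spectrogram(signal, n_fft=256, hop=128):
    """Log-magnitude spectrogram from a Hann-windowed STFT, scaled to [0, 1]."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    mag = np.abs(np.fft.rfft(frames, axis=1)).T  # shape: (freq_bins, frames)
    log_mag = np.log1p(mag)
    return log_mag / log_mag.max()


def global_ssim(x, y, data_range=1.0):
    """SSIM computed over the whole image as one window (a simplification)."""
    c1 = (0.01 * data_range) ** 2  # stabilizer for the luminance term
    c2 = (0.03 * data_range) ** 2  # stabilizer for the contrast/structure term
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx**2 + my**2 + c1) * (vx + vy + c2)
    )


# Hypothetical test signals: a clean tone and the same tone with added noise.
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
clean = np.sin(2 * np.pi * 440.0 * t)
noisy = clean + 0.5 * rng.standard_normal(t.shape)

clean_spec = spectrogram(clean)
noisy_spec = spectrogram(noisy)

ssim_same = global_ssim(clean_spec, clean_spec)
ssim_noisy = global_ssim(clean_spec, noisy_spec)
print(f"SSIM(clean, clean) = {ssim_same:.4f}")
print(f"SSIM(clean, noisy) = {ssim_noisy:.4f}")
```

In an enhancement setting, the second argument would be the spectrogram produced by the translation model, and a higher SSIM against the clean reference would indicate a closer match.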
