Improving Spectrograms for Sound Enhancement based on Image-to-image Translation

Yoshiaki KUROSAWA; Kazuya MERA; Toshiyuki TAKEZAWA

[3Rin4-59] Improving Spectrograms for Sound Enhancement based on Image-to-image Translation

〇Yoshiaki KUROSAWA¹, Kazuya MERA¹, Toshiyuki TAKEZAWA¹ (1. Graduate School of Information Sciences, Hiroshima City University)

Keywords: deep learning, sound enhancement, image transform

We examined pix2pix, a well-known image-to-image translation technique based on deep neural networks. Focusing on time-frequency analysis and implementing an auxiliary classifier generative adversarial network (ACGAN), we evaluated how well spectrograms can be transformed for sound enhancement. Measured with SSIM (structural similarity index), an image quality metric, the results confirmed a slight improvement in performance over the original research.
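The evaluation idea in the abstract, comparing spectrogram images with SSIM, can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the authors' pipeline: the abstract gives no STFT parameters, so the frame length, hop size, sample rate, and test tone here are hypothetical, and `global_ssim` is a simplified single-window SSIM rather than the usual sliding-window version. The pix2pix and ACGAN models themselves are not reproduced. Only the Python standard library is used so that the sketch runs without extra packages.

```python
import cmath
import math
import random


def log_spectrogram(signal, frame_len=64, hop=32):
    """Log-magnitude spectrogram from a Hann-windowed DFT (one row per frame).

    frame_len and hop are illustrative values, not taken from the paper.
    """
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    bins = frame_len // 2 + 1
    # Precompute DFT basis vectors for the non-negative frequency bins.
    basis = [[cmath.exp(-2j * math.pi * k * n / frame_len)
              for n in range(frame_len)] for k in range(bins)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        frames.append([math.log1p(abs(sum(x * w for x, w in zip(frame, row))))
                       for row in basis])
    return frames


def global_ssim(a, b):
    """Simplified SSIM computed once over the whole image (no sliding window)."""
    x = [v for row in a for v in row]
    y = [v for row in b for v in row]
    if len(x) != len(y):
        raise ValueError("images must have the same size")
    n = len(x)
    data_range = max(max(x), max(y)) - min(min(x), min(y))
    if data_range == 0:
        data_range = 1.0  # avoid zero stabilizers for constant images
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((v - mx) ** 2 for v in x) / n
    vy = sum((v - my) ** 2 for v in y) / n
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y)) / n
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2))


# Hypothetical test signal: a 1 kHz tone sampled at 8 kHz, plus Gaussian noise.
rng = random.Random(0)
sample_rate, tone_hz, length = 8000, 1000, 512
clean = [math.sin(2 * math.pi * tone_hz * t / sample_rate) for t in range(length)]
light = [s + rng.gauss(0.0, 0.05) for s in clean]
heavy = [s + rng.gauss(0.0, 1.0) for s in clean]

clean_spec = log_spectrogram(clean)
ssim_same = global_ssim(clean_spec, clean_spec)
ssim_light = global_ssim(clean_spec, log_spectrogram(light))
ssim_heavy = global_ssim(clean_spec, log_spectrogram(heavy))
print(ssim_same, ssim_light, ssim_heavy)
```

A spectrogram closer to the clean reference scores nearer to 1, so in an enhancement setting the output of a translation model would be scored against the clean spectrogram in the same way the noisy inputs are scored here.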

Presentation information

[3Rin4] Interactive 1
