JSAI2020

[2G1-ES-4] Robots and real worlds: Applied machine learning

Wed. Jun 10, 2020 9:00 AM - 10:20 AM Room G (jsai2020online-7)

Chair: Hiroshi Yamakawa (The University of Tokyo)

9:40 AM - 10:00 AM

[2G1-ES-4-03] Learning 2D Sound Source Localization for Microphone Array Layout Invariance using Explicit Transformation Layer

〇Phongtharin Vinayavekhin [1], Guillaume Le Moing [2,1], Jayakorn Vongkulbhisal [1], Don Joven Agravante [1], Tadanobu Inoue [1], Asim Munawar [1], Ryuki Tachibana [1] ([1] IBM Research Tokyo, [2] MINES ParisTech - PSL Research University)

Keywords: 2D Sound Localization, Microphone Arrays, Deep Learning

We address the task of estimating the 2D Cartesian coordinates of one or more sound sources in an enclosed environment using multiple microphone arrays. Deep learning has recently produced promising results on this task because of its robustness to noise and reverberation. However, it requires a large amount of labeled data, and the resulting model works well only for the microphone array layout seen in the training data. Recording and labeling data for every desired layout is costly and tedious. This paper proposes a solution: an explicit transformation layer embedded in the neural network. Our results in simulated acoustic environments show that the method allows the model to be trained on data from specific microphone array layouts while generalizing well to unseen layouts at inference time.
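The abstract does not specify how the explicit transformation layer is built. As a rough illustration of the general idea only, the sketch below assumes that each microphone array yields a source estimate in its own local frame, and that each array's position and orientation in the room are known. A fixed rigid 2D transform then maps every local estimate into the shared room frame before the estimates are combined. The function names `local_to_global` and `fuse_estimates`, the pose parameterization, and the averaging step are all hypothetical and are not taken from the paper.

```python
import numpy as np


def local_to_global(point_local, array_position, array_angle):
    """Map a 2D point from an array's local frame to the room frame.

    Assumption (not from the paper): the array pose is given as a 2D
    position and a counter-clockwise rotation angle in radians.
    """
    c, s = np.cos(array_angle), np.sin(array_angle)
    rotation = np.array([[c, -s],
                         [s, c]])
    # Rotate into the room orientation, then translate by the array position.
    return rotation @ np.asarray(point_local, dtype=float) + np.asarray(
        array_position, dtype=float
    )


def fuse_estimates(local_estimates, array_positions, array_angles):
    """Average per-array estimates after moving them into the room frame.

    Averaging is a placeholder for whatever a trained network would do
    with the transformed features.
    """
    in_room_frame = [
        local_to_global(p, pos, ang)
        for p, pos, ang in zip(local_estimates, array_positions, array_angles)
    ]
    return np.mean(in_room_frame, axis=0)


# Two arrays observing the same source from different poses.
array_positions = [(0.0, 0.0), (4.0, 0.0)]
array_angles = [0.0, np.pi / 2]
local_estimates = [(2.0, 3.0), (3.0, 2.0)]  # the source as seen by each array

source_estimate = fuse_estimates(local_estimates, array_positions, array_angles)
```

Because the layout enters only through the pose arguments, changing the array layout changes the inputs to the transform rather than anything that would be learned. That is one plausible reading of how an explicit transformation could support layout invariance.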
