JSAI2025

Presentation information

[3Win5] Poster session 3

Thu. May 29, 2025 3:30 PM - 5:30 PM Room W (Event hall D-E)

[3Win5-47] Improving the Noise Robustness of Environmental Sound Classification Models through CNN Selection and Data Augmentation Techniques

〇Yoshiki Ito¹, Masato Inoue¹ (1. Waseda University)

Keywords: Environmental Sound Classification, CNN, Data Augmentation

Environmental Sound Classification (ESC) is a key technology for understanding surrounding conditions, and methods based on the Vision Transformer framework have recently attracted attention. However, Transformer models are prone to overfitting when training data is scarce, and pre-trained models may not adapt well to the target sound environment. CNNs, in contrast, perform stably even without pre-training and with small amounts of data, and the denoising effect of convolutional processing helps reduce the impact of noise. This study therefore focuses on the noise robustness of CNNs, examining both the choice of CNN architecture and the introduction of data augmentation. First, five established CNNs were compared; then the data augmentation technique CutMix was introduced to improve performance on noisy data. The results showed that EfficientNet exhibited excellent noise robustness and that CutMix improved overall classification performance. These findings contribute to the practical application of accurate, noise-robust ESC models.
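The abstract does not describe how CutMix was applied, so the following is only a minimal sketch of the general technique, not the authors' implementation. It assumes each sound clip is represented as a 2-D spectrogram array (frequency × time) with a one-hot label; the function name `cutmix`, the Beta parameter `alpha=1.0`, and the array shapes are illustrative assumptions. A random rectangle from one spectrogram is pasted into another, and the labels are mixed in proportion to the area each sample occupies.

```python
import numpy as np

def cutmix(spec_a, label_a, spec_b, label_b, rng, alpha=1.0):
    """Paste a random rectangle of spec_b into spec_a; mix labels by area.

    Hypothetical helper for illustration: spec_* are 2-D arrays of equal
    shape (frequency x time), label_* are one-hot vectors.
    """
    height, width = spec_a.shape

    # Sample the target mixing ratio; alpha=1.0 (uniform) is an assumption.
    lam = rng.beta(alpha, alpha)

    # Rectangle whose area is roughly (1 - lam) of the spectrogram.
    cut_ratio = np.sqrt(1.0 - lam)
    cut_h, cut_w = int(height * cut_ratio), int(width * cut_ratio)

    # Random centre, with the rectangle clipped to the array bounds.
    cy, cx = rng.integers(height), rng.integers(width)
    top, bottom = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, height)
    left, right = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, width)

    mixed = spec_a.copy()
    mixed[top:bottom, left:right] = spec_b[top:bottom, left:right]

    # Recompute the ratio from the clipped rectangle so labels match pixels.
    lam_adj = 1.0 - (bottom - top) * (right - left) / (height * width)
    mixed_label = lam_adj * label_a + (1.0 - lam_adj) * label_b
    return mixed, mixed_label, lam_adj


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    spec_a = np.zeros((64, 128))          # stand-in for one clip's spectrogram
    spec_b = np.ones((64, 128))           # stand-in for another clip
    label_a = np.array([1.0, 0.0])
    label_b = np.array([0.0, 1.0])
    mixed, mixed_label, lam = cutmix(spec_a, label_a, spec_b, label_b, rng)
    print(mixed.shape, mixed_label, lam)
```

Recomputing the mixing ratio after clipping keeps the soft label consistent with the fraction of the spectrogram that actually comes from each clip, which matters when the rectangle falls partly outside the array.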
