Presentation information

[4I2-GS-7c] Vision, speech media processing: Speech recognition and instruction understanding

Fri. Jun 11, 2021 11:00 AM - 12:40 PM Room I (GS room 4)

Chair: Taiki Miyanishi (Advanced Telecommunications Research Institute International)

11:40 AM - 12:00 PM

[4I2-GS-7c-03] Parameter Reduction by Neural Ordinary Differential Equation (Neural ODE) for Small-Footprint Keyword Spotting

〇Hiroshi Fuketa¹, Yukinori Morita¹ (1. National Institute of Advanced Industrial Science and Technology)

Keywords: Keyword Spotting, Neural Network, Deep Learning

In this paper, we propose neural network models based on the neural ordinary differential equation (NODE) for small-footprint keyword spotting (KWS). KWS, which detects predefined keywords in input audio data, has attracted much attention as a promising technique for realizing so-called “voice user interfaces” that control mobile phones and smart speakers by voice. Recently, many researchers have demonstrated KWS with artificial neural networks and achieved high inference accuracy. Voice-controlled devices, however, are usually battery-operated, so their memory footprint and compute resources are severely restricted. To cope with this restriction, we present techniques for applying NODE to KWS that reduce the number of parameters and computations during inference. Finally, we show that the proposed model has 68% fewer parameters than the conventional KWS model.
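The abstract does not describe the model architecture, so the following is only a minimal sketch of the general idea behind NODE-style parameter reduction, not the authors' method. It assumes a toy dense layer as the dynamics function and a fixed-step Euler solver, and it compares a conventional residual stack, where each block has its own weights, with a single block whose weights are reused at every integration step. All names (`make_layer`, `node_forward`, `DIM`, `BLOCKS`) are hypothetical, and the 75% reduction computed here is a property of this toy setup, unrelated to the 68% reported above.

```python
import math
import random


def make_layer(dim, rng):
    """Create one toy dense layer: a dim x dim weight matrix and a bias vector."""
    w = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(dim)]
    b = [0.0] * dim
    return w, b


def f(h, layer):
    """Assumed dynamics function f(h) = tanh(W h + b)."""
    w, b = layer
    return [math.tanh(sum(wij * hj for wij, hj in zip(row, h)) + bi)
            for row, bi in zip(w, b)]


def count_params(layers):
    """Count weights and biases over a list of layers."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in layers)


def resnet_forward(h, layers):
    """Residual stack: h <- h + f_k(h), with separate weights per block."""
    for layer in layers:
        h = [hi + fi for hi, fi in zip(h, f(h, layer))]
    return h


def node_forward(h, layer, steps, t1=1.0):
    """Euler integration of dh/dt = f(h) from t = 0 to t1, reusing one layer."""
    dt = t1 / steps
    for _ in range(steps):
        h = [hi + dt * fi for hi, fi in zip(h, f(h, layer))]
    return h


DIM, BLOCKS = 16, 4  # hypothetical sizes chosen for illustration
rng = random.Random(0)
res_layers = [make_layer(DIM, rng) for _ in range(BLOCKS)]
ode_layer = make_layer(DIM, rng)

res_params = count_params(res_layers)    # BLOCKS * (DIM * DIM + DIM)
ode_params = count_params([ode_layer])   # DIM * DIM + DIM
reduction = 1.0 - ode_params / res_params

x = [rng.uniform(-1.0, 1.0) for _ in range(DIM)]
y_res = resnet_forward(x, res_layers)
y_ode = node_forward(x, ode_layer, steps=BLOCKS)
```

In this sketch the parameter count no longer grows with depth, because the number of Euler steps can be changed without adding weights. The per-step computation is unchanged, so the sketch says nothing about the reduction in computations that the abstract also claims.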

