Keywords: Keyword Spotting, Neural Network, Deep Learning
In this paper, we propose neural network models based on the neural ordinary differential equation (NODE) for small-footprint keyword spotting (KWS). KWS, which detects pre-defined keywords in input audio data, is attracting much attention as a promising technique for realizing a so-called “voice user interface” that lets users control mobile phones and smart speakers by voice. Recently, many researchers have demonstrated KWS with artificial neural networks and have achieved high inference accuracy. Voice-controlled devices are, however, usually battery-operated, so their memory footprint and compute resources are severely restricted. To cope with this restriction, we present techniques for applying NODE to KWS that reduce the number of parameters and computations during inference. Finally, we show that the proposed model has 68% fewer parameters than the conventional KWS model.
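The parameter saving described in the abstract can be illustrated with a minimal sketch. It is not the proposed KWS model: the dense `tanh` layer, the fixed-step Euler solver, and all names (`make_layer`, `node_forward`, `dim`, `depth`) are assumptions chosen for illustration. It only shows why a NODE-style block, which reuses one set of weights across all integration steps, needs fewer parameters than a stack of residual blocks of the same depth. The 68% figure reported above comes from the paper's own experiments and is not reproduced here.

```python
import math
import random


def make_layer(dim, rng):
    """Return (weights, bias) for one dense dim x dim layer (toy stand-in)."""
    w = [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(dim)]
    b = [0.0] * dim
    return w, b


def f(x, layer):
    """Dynamics function f(x) = tanh(W x + b)."""
    w, b = layer
    return [math.tanh(sum(wi * xi for wi, xi in zip(row, x)) + bi)
            for row, bi in zip(w, b)]


def resnet_forward(x, layers):
    """Residual stack: x <- x + f_k(x), one parameter set per block."""
    for layer in layers:
        x = [xi + fi for xi, fi in zip(x, f(x, layer))]
    return x


def node_forward(x, layer, steps, t1=1.0):
    """Euler integration of dx/dt = f(x) on [0, t1] with ONE shared layer."""
    h = t1 / steps
    for _ in range(steps):
        x = [xi + h * fi for xi, fi in zip(x, f(x, layer))]
    return x


def count_params(layers):
    """Total number of weights and biases in a list of layers."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in layers)


rng = random.Random(0)
dim, depth = 8, 4  # hypothetical sizes, not taken from the paper

res_layers = [make_layer(dim, rng) for _ in range(depth)]
ode_layer = make_layer(dim, rng)

x0 = [0.5] * dim
y_res = resnet_forward(x0, res_layers)
y_ode = node_forward(x0, ode_layer, steps=depth)

res_params = count_params(res_layers)   # depth * (dim*dim + dim)
ode_params = count_params([ode_layer])  # dim*dim + dim, independent of steps
print(res_params, ode_params)
```

In this toy setting the parameter count of the shared block does not grow with the number of solver steps, whereas the residual stack grows linearly with depth. The sketch does not reduce the number of function evaluations; the computation savings claimed in the abstract rely on techniques that are not described in this text.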