Gated convolutional neural network-based voice activity detection under high-level noise environments
This paper addresses voice activity detection (VAD) in high-noise environments where the signal-to-noise ratio (SNR) is below -5 dB. As demand for hands-free applications grows, critically low SNR conditions become unavoidable; the noise may be internal ego noise generated by the device itself or external noise from the environment, as with rescue robots operating at a disaster site or navigation systems in a car moving at high speed. To achieve accurate VAD under such conditions, this paper proposes a gated convolutional neural network-based approach that captures both long- and short-term dependencies in time series as cues for detection. Experimental evaluations using the high-level ego noise of a hose-shaped rescue robot showed that the proposed method achieved an average VAD accuracy of about 86% in environments with SNRs ranging from -30 dB to -5 dB.
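The abstract does not specify the network architecture, so the following is only a minimal sketch of what one gated convolutional layer could look like. It assumes a gated-linear-unit style layer in which a linear convolution is multiplied element-wise by a sigmoid gate computed from a second convolution. The function name `gated_conv1d`, the parameter names, the tensor shapes, and the use of dilation are all illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def sigmoid(x):
    """Logistic function used for the gate."""
    return 1.0 / (1.0 + np.exp(-x))


def gated_conv1d(x, w_lin, w_gate, b_lin, b_gate, dilation=1):
    """Hypothetical gated 1-D convolution over a feature sequence.

    x       : (T, C_in) array, e.g. T frames of C_in spectral features
    w_lin   : (K, C_in, C_out) kernel of the linear path (K must be odd)
    w_gate  : (K, C_in, C_out) kernel of the gate path
    b_lin   : (C_out,) bias of the linear path
    b_gate  : (C_out,) bias of the gate path
    dilation: spacing between kernel taps (assumption, not from the paper)

    Returns a (T, C_out) array: (x * w_lin + b_lin) * sigmoid(x * w_gate + b_gate).
    """
    K, _, c_out = w_lin.shape
    if K % 2 == 0:
        raise ValueError("kernel size K must be odd for symmetric padding")
    T = x.shape[0]
    pad = dilation * (K - 1) // 2
    # Zero-pad along time so the output has the same number of frames.
    x_pad = np.pad(x, ((pad, pad), (0, 0)))

    lin = np.zeros((T, c_out))
    gate = np.zeros((T, c_out))
    for k in range(K):
        # Frames seen by kernel tap k, shifted by k * dilation.
        seg = x_pad[k * dilation : k * dilation + T]
        lin += seg @ w_lin[k]
        gate += seg @ w_gate[k]

    # The sigmoid gate decides, per frame and channel, how much passes through.
    return (lin + b_lin) * sigmoid(gate + b_gate)
```

Under these assumptions, stacking such layers with increasing dilation (for example 1, 2, 4) would widen the temporal context seen by each output frame, which is one common way to expose both short- and long-term dependencies to a frame-level speech/non-speech classifier.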