Applying Convolutional Neural Networks to the Analysis of Mouse Ultrasonic Vocalizations
Many species in diverse taxonomic groups, including rodents, bats, and insects, communicate with complex ultrasonic vocalizations (USVs; >20 kHz). Processing and analyzing USV recordings involves two main steps: detecting vocalizations and classifying them into syllable types. We recently developed an efficient algorithm for detecting mouse USVs, the Automatic Mouse Ultrasound Detector (A-MUD). The main challenge is detecting USVs in recordings with a low signal-to-noise ratio, which leads to high rates of false positives (FPs). Mice produce many short USVs (<10 ms), which especially inflate FP rates. We aimed to improve the detection of mouse USVs with A-MUD by classifying vocalizations into three discrete syllable types (with 0, 1, or ≥2 frequency jumps) or as FPs. Supervised convolutional neural networks (CNNs) were given 2D gammatone-filtered spectrograms (GFSs), adapted to the frequency range of mice, as input. The CNNs achieved an overall accuracy of 95±1.2% and a macro-F1 score of 90±2.7%. In contrast, multilayer feed-forward neural networks achieved an overall accuracy of only 85.4±1.9% and a macro-F1 score of 75.4±2.9%, which indicates that CNNs outperformed this conventional classification method.
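The two reported metrics can differ noticeably when the four classes are unevenly represented, because macro-F1 averages the per-class F1 scores with equal weight while overall accuracy is dominated by the most frequent class. The sketch below illustrates how the two are commonly computed; the class labels, the function name, and the toy data are assumptions made for illustration and are not taken from the study.

```python
# Hypothetical labels for the four categories named in the abstract.
CLASSES = ["0 jumps", "1 jump", ">=2 jumps", "FP"]


def accuracy_and_macro_f1(y_true, y_pred, classes=CLASSES):
    """Return (overall accuracy, macro-F1) for two equal-length label lists."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    f1_scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)  # true positives
        fp = sum(t != c and p == c for t, p in pairs)  # false positives
        fn = sum(t == c and p != c for t, p in pairs)  # false negatives
        denom = 2 * tp + fp + fn
        # Convention assumed here: a class with no true or predicted
        # instances contributes an F1 of 0.
        f1_scores.append(2 * tp / denom if denom else 0.0)

    macro_f1 = sum(f1_scores) / len(f1_scores)
    return accuracy, macro_f1


# Toy data (invented): two examples per class, two misclassifications.
y_true = ["0 jumps", "0 jumps", "1 jump", "1 jump",
          ">=2 jumps", ">=2 jumps", "FP", "FP"]
y_pred = ["0 jumps", "0 jumps", "1 jump", "0 jumps",
          ">=2 jumps", ">=2 jumps", "FP", ">=2 jumps"]

acc, f1 = accuracy_and_macro_f1(y_true, y_pred)
print(round(acc, 3), round(f1, 3))  # prints: 0.75 0.733
```

In this toy case each class has the same number of examples, so the two metrics stay close; with skewed class frequencies, a classifier that neglects a rare syllable type would lose far more macro-F1 than accuracy.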