Assessing the role of onsets for musical instrument identification in an auditory modeling framework
Sound onsets have long been known to provide important cues for musical instrument identification by human listeners. It has remained unclear, however, whether this rests on perceptual or acoustical grounds: do listeners exploit informative features that are available only in the onset, or are equally informative acoustic features present throughout the sound that listeners tend to ignore because of their redundancy? Here we approach this question using the automatic speech recognition-based simulation framework for auditory discrimination experiments (FADE) [Schädler et al., 2016, JASA] to model data from a recent study on instrument identification [Siedenburg, under review]. In that study, listeners were tasked with identifying Western orchestral instruments from 64-ms segments extracted from the onset or from the middle portion of the sounds. In the present study, instrument identification is modeled using FADE with three different feature sets: log-Mel spectrograms, Mel-frequency cepstral coefficients (MFCCs), and separable Gabor filter bank features (SGBFB). Results indicate that all three feature sets yield a strong decrease in classification performance for the middle portions of the sounds. This suggests that the utility of onsets for sound identification may rest not on specific properties of human auditory perception but on the particular acoustic richness of sound onsets.
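To make the first two feature sets concrete, the following is a minimal NumPy sketch of extracting a log-Mel spectrogram and MFCCs from a 64-ms segment (1024 samples at an assumed 16 kHz rate). All parameter values here (FFT length, hop size, number of Mel channels and cepstral coefficients) are illustrative assumptions and do not reproduce the exact FADE front end used in the study; SGBFB features, which apply 2-D Gabor filters to the log-Mel spectrogram, are omitted for brevity.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular filters spaced evenly on the Mel scale (textbook construction)."""
    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        lo, ce, hi = bins[i], bins[i + 1], bins[i + 2]
        for k in range(lo, ce):          # rising slope
            if ce > lo:
                fb[i, k] = (k - lo) / (ce - lo)
        for k in range(ce, hi):          # falling slope
            if hi > ce:
                fb[i, k] = (hi - k) / (hi - ce)
    return fb

def log_mel_spectrogram(x, sr, n_fft=256, hop=128, n_mels=23):
    """Windowed power spectrogram mapped through a Mel filterbank, then log-compressed."""
    n_frames = 1 + (len(x) - n_fft) // hop
    win = np.hanning(n_fft)
    frames = np.stack([x[i * hop : i * hop + n_fft] * win for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T
    return np.log(mel + 1e-10)           # small floor avoids log(0)

def mfcc(logmel, n_ceps=13):
    """DCT-II across Mel channels decorrelates the log-Mel features."""
    n_mels = logmel.shape[1]
    k = np.arange(n_mels)
    basis = np.cos(np.pi / n_mels * (k + 0.5)[None, :] * np.arange(n_ceps)[:, None])
    return logmel @ basis.T

# Hypothetical 64-ms segment: a 440 Hz tone at 16 kHz (1024 samples)
sr = 16000
segment = np.sin(2 * np.pi * 440 * np.arange(1024) / sr)
lm = log_mel_spectrogram(segment, sr)    # shape: (frames, Mel channels)
ceps = mfcc(lm)                          # shape: (frames, cepstral coefficients)
```

With these assumed settings, a 1024-sample segment yields 7 frames of 23 log-Mel channels and 13 MFCCs per frame; in FADE, such feature vectors feed a standard GMM/HMM back end for the identification task.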