Restoring Lost Speech Components with Generative Adversarial Networks for Speech Communications in Adverse Conditions
Speech enhancement has been widely used to restore speech quality in communication between humans or between humans and machines. Across communication scenarios and channel or environmental conditions, the type and degree of speech distortion can vary significantly, and many speech enhancement strategies have been developed accordingly. This study deals with a severe distortion problem in which part of the spectral and/or temporal components of the speech is completely lost. The spectral loss is simulated by a transmission channel with a very narrow passband (below 2 kHz), which results in severely degraded speech quality; the temporal loss is simulated by packet loss of up to 20% in massive communication, which results in poor speech intelligibility. A generative adversarial network (GAN) based speech enhancement scheme is proposed for restoring the missing spectral and temporal components, using different network structures and parameters. A set of experiments has been conducted to evaluate the effectiveness of the proposed enhancement scheme, and promising results have been achieved.
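
The two degradations described above could be simulated along the following lines. This is a minimal sketch, not the procedure used in the study: the brick-wall FFT low-pass filter, the 20 ms packet length, the zero-filling of lost packets, and the function names `simulate_narrowband` and `simulate_packet_loss` are all assumptions made for illustration. Only the 2 kHz bandwidth limit and the 20% loss rate come from the abstract.

```python
import numpy as np


def simulate_narrowband(signal, sample_rate, cutoff_hz=2000.0):
    """Remove all spectral content above cutoff_hz.

    Assumption: an ideal brick-wall low-pass applied in the FFT domain
    stands in for the narrow-band transmission channel.
    """
    signal = np.array(signal, dtype=float)
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    spectrum[freqs > cutoff_hz] = 0.0  # discard the band above the cutoff
    return np.fft.irfft(spectrum, n=len(signal))


def simulate_packet_loss(signal, sample_rate, loss_rate=0.2,
                         packet_ms=20.0, seed=0):
    """Zero out a random subset of fixed-length packets.

    Assumptions: packets are packet_ms long, losses are independent and
    uniformly distributed, and a lost packet is replaced by silence.
    Returns the degraded signal and the sorted indices of lost packets.
    """
    rng = np.random.default_rng(seed)
    degraded = np.array(signal, dtype=float)  # copy, input is untouched
    packet_len = int(sample_rate * packet_ms / 1000.0)
    n_packets = int(np.ceil(len(degraded) / packet_len))
    n_lost = int(round(loss_rate * n_packets))
    lost = np.sort(rng.choice(n_packets, size=n_lost, replace=False))
    for p in lost:
        degraded[p * packet_len:(p + 1) * packet_len] = 0.0
    return degraded, lost
```

Degraded and clean versions of the same utterance produced this way would form the input and target pairs for training an enhancement network; a real channel or packet-loss model may behave differently from this idealized simulation.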