Speech Enhancement

Speech enhancement is one of our central research topics. Its applications include mobile voice communications, hearing aids, and human-machine interfaces, and a correspondingly wide range of methods exists. We focus on noise reduction, with the goal of improving listener comfort and increasing the intelligibility of the acoustic signal. We employ methods based on single-microphone signals as well as on multiple-microphone signals (microphone arrays). Developing speech enhancement methods requires a blend of physical modeling and statistical signal processing techniques.

Most of our enhancement techniques operate in the spectral domain. Typically, the noisy speech signal is segmented into short frames, transformed, enhanced, inverse transformed, and overlap-added to reconstruct the enhanced signal (see Figure). The benefits of spectral processing are

  • a concentration of speech energy in few spectral parameters (especially for voiced speech),
  • a simpler statistical description as compared to the time domain, and
  • the possibility of applying psychoacoustic principles.

The block diagram of a typical system is shown below [Martin et al., 2004].

[Figure: block diagram of a typical system]
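The processing chain described above (segment, transform, enhance, inverse transform, overlap-add) can be sketched in a few lines of Python with NumPy. This is a minimal illustration under stated assumptions, not the system from the cited publications: the function name `enhance`, its parameters, the square-root Hann window with 50% overlap, the noise estimate taken from the first few frames, and the floored Wiener-type gain are all choices made for this example only.

```python
import numpy as np


def enhance(noisy, frame_len=256, hop=128, noise_frames=5, gain_floor=0.1):
    """Reduce stationary noise by frame-wise spectral weighting and overlap-add.

    Hypothetical helper: all names and defaults are illustrative assumptions.
    Requires len(noisy) >= frame_len + (noise_frames - 1) * hop.
    """
    noisy = np.asarray(noisy, dtype=float)

    # Square-root periodic Hann window, used for both analysis and synthesis.
    # With hop = frame_len // 2 the squared windows sum to one, so a gain of 1
    # in every bin reproduces the input (away from the signal edges).
    n = np.arange(frame_len)
    window = np.sqrt(0.5 - 0.5 * np.cos(2.0 * np.pi * n / frame_len))

    def spectrum(frame_index):
        # Segment one frame, apply the analysis window, and transform it.
        start = frame_index * hop
        return np.fft.rfft(noisy[start:start + frame_len] * window)

    n_frames = 1 + (len(noisy) - frame_len) // hop

    # Assumption: the first `noise_frames` frames contain noise only, so their
    # averaged periodogram serves as the noise power estimate.
    noise_psd = np.mean(
        [np.abs(spectrum(m)) ** 2 for m in range(noise_frames)], axis=0
    )
    noise_psd = np.maximum(noise_psd, 1e-12)  # avoid division by zero

    enhanced = np.zeros(len(noisy))
    for m in range(n_frames):
        spec = spectrum(m)
        # A posteriori SNR per frequency bin.
        snr_post = np.maximum(np.abs(spec) ** 2 / noise_psd, 1e-12)
        # Wiener-type gain 1 - 1/SNR, limited from below by a gain floor.
        gain = np.maximum(1.0 - 1.0 / snr_post, gain_floor)
        # Inverse transform, apply the synthesis window, and overlap-add.
        start = m * hop
        enhanced[start:start + frame_len] += (
            np.fft.irfft(gain * spec, n=frame_len) * window
        )
    return enhanced


# Usage: a 440 Hz tone in white noise, preceded by a noise-only segment.
rng = np.random.default_rng(0)
t = np.arange(8000) / 8000.0
clean = np.sin(2.0 * np.pi * 440.0 * t)
clean[:1024] = 0.0
noisy = clean + 0.3 * rng.standard_normal(t.size)
enhanced = enhance(noisy)
```

The sketch also hints at the first benefit listed above: the tone occupies only a few frequency bins with high SNR, which pass almost unchanged, while the remaining noise-dominated bins are attenuated toward the gain floor. Practical systems replace the fixed initial noise estimate with one that is tracked over time, and use smoother gain rules to avoid fluctuating residual noise.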

Martin, R.: Statistical Methods for the Enhancement of Noisy Speech. In: Speech Enhancement, J. Benesty et al. (eds.), Springer-Verlag, 2005.

Martin, R.; Malah, D.; Cox, R. V.; Accardi, A. J.: A Noise Reduction Preprocessor for Mobile Voice Communication. EURASIP Journal on Applied Signal Processing (JASP), No. 8, pp. 1046-1058, 2004.

Martin, R.: Statistical Methods for the Enhancement of Noisy Speech. Proc. Intl. Workshop Acoustic Echo and Noise Control (IWAENC), pp. 1-6, 2003.

Breithaupt, C.; Martin, R.: MMSE Estimation of Magnitude-Squared DFT Coefficients with Supergaussian Priors. Proc. IEEE Intl. Conf. Acoustics, Speech, Signal Processing (ICASSP), 2003.