Digital Signal Processing Group:
Robust Speech Recognition
Even after many years of deployment, automatic speech recognition (ASR) still depends too heavily on high-quality audio input to be useful in many interesting real-life applications.
The goal of our research group is to counter these shortcomings of standard ASR and attain highly robust automatic speech recognition. To this end, we are working on a range of approaches:
Ideally, we use several microphones (for blind source separation) and several modalities (in audiovisual speech recognition) to provide reliable human-machine interaction even in very noisy and/or reverberant environments.
In addition, the microphone signals are typically pre-processed to remove noise and reverberation via statistical speech signal processing. This removal of environmental distortions has been shown to significantly improve ASR accuracy, both for ASR systems trained on clean data and, owing to the variance reduction inherent in this pre-processing, for ASR with multi-condition training.
In conjunction with all of the above methods, we also pass reliability information from the preprocessing stages to the ASR on a feature-by-feature and frame-by-frame basis. This allows us to weight the reliable components of the available information more heavily than the unreliable ones when searching for the overall recognition output. This uncertainty-of-observation approach to robust ASR is beneficial for both multi-channel and single-channel ASR; for a range of examples, see Robust Speech Recognition of Uncertain or Missing Data.
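As a minimal sketch of how per-feature, per-frame reliability information can enter the recognizer, the following Python example uses one widely known variant of the idea: the estimated variance of each feature is added to the variance of the acoustic model's Gaussian before scoring. This is an illustrative assumption, not a description of a specific system; the function names, the diagonal-Gaussian state model, and all numbers are hypothetical.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def uncertain_log_likelihood(feat_mean, feat_var, state_mean, state_var):
    """Score an uncertain observation against one Gaussian state.

    feat_mean: feature estimate delivered by the preprocessing stage
    feat_var:  its estimated per-feature uncertainty (0 = fully reliable)

    Adding feat_var to the model variance flattens the score along
    unreliable feature dimensions, so they influence the decision less.
    """
    inflated = [sv + fv for sv, fv in zip(state_var, feat_var)]
    return log_gauss_diag(feat_mean, state_mean, inflated)


# Hypothetical two-state example: the second feature of this frame is
# badly distorted, and preprocessing flags it with a large variance.
state_a = ([0.0, 0.0], [1.0, 1.0])  # (mean, variance)
state_b = ([2.0, 4.0], [1.0, 1.0])
frame = [0.1, 3.0]
frame_var = [0.0, 25.0]

plain_a = log_gauss_diag(frame, *state_a)
plain_b = log_gauss_diag(frame, *state_b)
robust_a = uncertain_log_likelihood(frame, frame_var, *state_a)
robust_b = uncertain_log_likelihood(frame, frame_var, *state_b)

print("ignoring uncertainty, best state:", "A" if plain_a > plain_b else "B")
print("using uncertainty, best state:   ", "A" if robust_a > robust_b else "B")
```

In this toy frame, conventional scoring is dominated by the distorted second feature, whereas the uncertainty-aware score relies mainly on the reliable first feature. With all uncertainties set to zero, the two scores coincide.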
Pattern Recognition for Communication and Technical Diagnostics
While originally developed for robust speech recognition, the idea of using time-variant reliability information is also valuable for pattern analysis and recognition in general.
In collaboration with institutional and industrial partners, we are currently working on extending the idea of uncertainty-of-observation techniques both to more reliable data transmission and to the fault diagnosis of complex technical systems.