Source Localization and Speech Enhancement for Speech Recognition for Real time Environment

This is a Master's thesis from Blekinge Tekniska Högskola/Sektionen för ingenjörsvetenskap

Abstract: Speech communication is rapidly growing in popularity in contexts such as conferencing systems, mobile and fixed electronic devices, and laptops, leading to heightened demand for new services and improved speech quality. Dictaphones used for dictation usually have a single microphone. A single microphone does not provide enough degrees of freedom to estimate the location of the source. A microphone array uses multiple microphones for spatial filtering, which suppresses background noise. This report aims at speech enhancement, exploiting the benefits of microphone arrays to find the direction of the desired speaker and steer the listening beam in that direction. Generalized Cross Correlation (GCC) methods are compared for locating the source in a real office environment. Beamforming is implemented to make the microphone array listen in the desired direction, thereby reducing interference from other sources. The Minimum Variance Distortionless Response (MVDR) approach is shown to give better results than simpler techniques. A perceptually based eigenfilter in the suppressor, incorporating human hearing models in the subspace domain, eliminates the residual noise. Objective system performance is evaluated by estimating Signal-to-Noise Ratio Improvement (SNRI), segmental SNR, signal degradation and noise suppression. Perceptual Evaluation of Speech Quality (PESQ) gives a Mean Opinion Score for subjective evaluation.
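To make the source localization step concrete, the following is a minimal sketch of time-delay estimation between two microphones using plain (unweighted) cross-correlation, the simplest member of the GCC family. It is not the thesis implementation: the function names `estimate_delay` and `delay_to_angle`, the two-microphone far-field geometry, the 0.1 m spacing, the 16 kHz sampling rate and the 343 m/s speed of sound are all assumptions chosen for illustration. Weighted variants such as GCC-PHAT apply a frequency-domain weighting before locating the peak, which this sketch omits.

```python
import math


def estimate_delay(ref, sig, max_lag):
    """Return the lag (in samples) that maximizes the cross-correlation.

    A positive result means `sig` is a delayed copy of `ref`.
    This is the unweighted case of GCC (no PHAT or other weighting).
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(len(ref)):
            j = i + lag
            if 0 <= j < len(sig):
                score += ref[i] * sig[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def delay_to_angle(lag, fs, spacing, c=343.0):
    """Convert a sample lag to a far-field arrival angle in degrees.

    0 degrees is broadside to the (hypothetical) two-microphone pair.
    fs is the sampling rate in Hz, spacing the microphone distance in
    metres, c the assumed speed of sound in m/s.
    """
    x = (lag / fs) * c / spacing
    x = max(-1.0, min(1.0, x))  # clip so asin stays defined
    return math.degrees(math.asin(x))


# Illustrative use: a short pulse and the same pulse delayed by 3 samples.
ref = [0, 0, 1, 2, 3, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
sig = [0, 0, 0] + ref[:-3]
lag = estimate_delay(ref, sig, max_lag=5)
angle = delay_to_angle(lag, fs=16000, spacing=0.1)
```

The estimated angle could then serve as the steering direction for a beamformer. In a reverberant office the plain correlation peak tends to smear, which is one reason weighted GCC variants are usually compared against it.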
