Header menu link for other important links
Efficient speaker and noise normalization for robust speech recognition
Raghavendra R. Bilgi,
Published in
Pages: 2601 - 2604
In this paper, we describe a computationally efficient approach for combining speaker and noise normalization techniques. In particular, we combine the simple yet effective Histogram Equalization (HEQ) for noise compensation with Vocal-tract length normalization (VTLN) for speaker-normalization. While it is intuitive to remove noise first and then perform VTLN, this is difficult since HEQ performs noise compensation in the cepstral domain, while VTLN involves warping in spectral domain. In this paper, we investigate the use of the recently proposed T-VTLN approach to speaker normalization where matrix transformations are directly applied on cepstral features. We show that the speaker-specific warp-factors estimated even from noisy speech using this approach closely match those from clean-speech. Further, using sub-band HEQ (S-HEQ) and TVTLN we get a significant relative improvement of 20% and an impressive 33.54% over baseline in recognition accuracy for Aurora-2 and Aurora-4 task respectively. Copyright © 2011 ISCA.
About the journal
JournalProceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Open AccessNo
Concepts (8)
  •  related image
  •  related image
  •  related image
    Robust features
  •  related image
  •  related image
  •  related image
  •  related image
    Face recognition
  •  related image
    Speech recognition