Efficient speaker and noise normalization for robust speech recognition

Raghavendra R. Bilgi; Srinivasan Umesh

Published in 2011

Pages: 2601 - 2604

Abstract

In this paper, we describe a computationally efficient approach for combining speaker and noise normalization techniques. In particular, we combine the simple yet effective Histogram Equalization (HEQ) for noise compensation with Vocal-Tract Length Normalization (VTLN) for speaker normalization. While it is intuitive to remove noise first and then perform VTLN, this is difficult because HEQ performs noise compensation in the cepstral domain, whereas VTLN involves warping in the spectral domain. We therefore investigate the recently proposed T-VTLN approach to speaker normalization, in which matrix transformations are applied directly to cepstral features. We show that the speaker-specific warp factors estimated with this approach, even from noisy speech, closely match those estimated from clean speech. Further, using sub-band HEQ (S-HEQ) and T-VTLN, we obtain relative improvements in recognition accuracy over the baseline of 20% on the Aurora-2 task and 33.54% on the Aurora-4 task. Copyright © 2011 ISCA.
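
As an illustration of the cepstral-domain noise compensation mentioned in the abstract, the following is a minimal sketch of plain (full-band) histogram equalization, not the sub-band S-HEQ variant evaluated in the paper. It assumes a per-utterance empirical CDF and a standard Gaussian reference distribution; the function name `histogram_equalize` and both of these choices are illustrative assumptions rather than details taken from the paper.

```python
from statistics import NormalDist

import numpy as np


def histogram_equalize(features: np.ndarray) -> np.ndarray:
    """Equalize each cepstral dimension of one utterance (hypothetical helper).

    features: array of shape (num_frames, num_coeffs).
    Returns an array of the same shape whose columns follow, approximately,
    a standard normal distribution (the assumed reference).
    """
    num_frames, num_coeffs = features.shape
    reference = NormalDist(mu=0.0, sigma=1.0)  # assumed reference distribution
    equalized = np.empty_like(features, dtype=float)
    for d in range(num_coeffs):
        # Rank of each frame within this dimension (0 = smallest value).
        order = np.argsort(features[:, d], kind="stable")
        ranks = np.argsort(order, kind="stable")
        # Empirical CDF, offset by 0.5 so it stays strictly inside (0, 1).
        cdf = (ranks + 0.5) / num_frames
        # The inverse reference CDF maps each CDF value to its equalized value.
        equalized[:, d] = [reference.inv_cdf(float(p)) for p in cdf]
    return equalized


# Usage: a monotone distortion of the features leaves the output unchanged,
# because the mapping depends only on the rank order within each dimension.
rng = np.random.default_rng(0)
clean = rng.normal(size=(200, 13))
distorted = 0.5 * clean + 2.0
same = np.allclose(histogram_equalize(clean), histogram_equalize(distorted))
```

Because the output of this sketch depends only on rank order, any monotonically increasing distortion of a coefficient is removed. The equalized features remain in the cepstral domain, which is why a speaker normalization applied directly to cepstra can follow this step.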

About the journal

Journal	Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
ISSN	1990-9772
Open Access	No

Authors (1)

Srinivasan Umesh
- Department of Electrical Engineering

Concepts (8)

HEQ
NOISE COMPENSATION
Robust features
Subbands
T-VTLN
VTLN
Face recognition
Speech recognition
