Noise and speaker compensation in the log filter bank domain
Raghavendra R. Bilgi,
Published in
2012
Pages: 4709 - 4712
Abstract
In this paper, we propose a method to compensate for noise and speaker variability directly in the log filter-bank (FB) domain, so that the resulting MFCC features are robust to noise and speaker variations. For noise compensation, we use the Vector Taylor Series (VTS) approach in the log FB domain; speaker normalization is also done in the log FB domain, using linear vocal tract length normalization (VTLN) matrices. For VTLN, the optimal warp factor is selected in the log FB domain using a canonical GMM, avoiding the two-pass approach needed with an HMM. Further, this selection can be implemented efficiently using sufficient statistics obtained from the GMM and the FB-VTLN matrices. Warp-factor selection using the GMM can also be done in the cepstral domain by applying DCT matrices, without the usual approximations associated with conventional linear VTLN. The elegance of the proposed approach is that, given the speech data, we directly obtain MFCC features that are robust to noise and speaker variations. The proposed approach shows a significant relative improvement of 31% over the baseline on the Aurora-4 task. © 2012 IEEE.
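The noise-compensation step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the textbook additive-noise mismatch function in the log FB domain, y = log(exp(x) + exp(n)), with no channel term, a first-order Taylor expansion around the means, and diagonal covariances. The function name `vts_compensate_logfb` and its argument names are hypothetical.

```python
import numpy as np

def vts_compensate_logfb(mu_x, var_x, mu_n, var_n):
    """First-order VTS update of one Gaussian in the log filter-bank domain.

    Assumptions (illustrative, not taken from the paper): additive noise
    only, mismatch y = log(exp(x) + exp(n)), diagonal covariances.
    All inputs are per-channel arrays of equal length.
    """
    mu_x, var_x, mu_n, var_n = (
        np.asarray(a, dtype=float) for a in (mu_x, var_x, mu_n, var_n)
    )
    # Zeroth-order term: the mismatch function evaluated at the means.
    mu_y = np.logaddexp(mu_x, mu_n)
    # Per-channel Jacobian dy/dx = exp(x) / (exp(x) + exp(n)), in a
    # numerically stable form; dy/dn is then 1 - g.
    g = np.exp(mu_x - mu_y)
    # First-order variance propagation through the linearised mismatch.
    var_y = g**2 * var_x + (1.0 - g) ** 2 * var_n
    return mu_y, var_y, g

# Usage: channel 0 has speech and noise at equal level, channel 1 is
# dominated by speech, so its statistics stay close to the clean ones.
mu_y, var_y, g = vts_compensate_logfb(
    mu_x=[2.0, 10.0], var_x=[1.0, 1.0], mu_n=[2.0, -40.0], var_n=[4.0, 4.0]
)
```

In a full system this update would be applied to every Gaussian of the clean-speech model, with the noise statistics re-estimated per utterance. Because log FB channels are correlated in practice, the diagonal-covariance assumption here is a simplification.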
About the journal
Journal: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN: 1520-6149
Open Access: No
Concepts (15)
  •  CEPSTRAL DOMAIN
  •  HMM MODELS
  •  NOISE COMPENSATION
  •  OPTIMAL SELECTION
  •  SPEAKER NORMALIZATION
  •  Speech data
  •  SUFFICIENT STATISTICS
  •  TVTLN
  •  VECTOR TAYLOR SERIES
  •  VOCAL TRACT LENGTHS
  •  VTS
  •  Filter banks
  •  Matrix algebra
  •  Speech recognition
  •  Acoustic noise