Improved cepstral mean and variance normalization using Bayesian framework
Pages: 156 - 161
Cepstral Mean and Variance Normalization (CMVN) is a computationally efficient normalization technique for noise-robust speech recognition. The performance of CMVN is known to degrade for short utterances, both because there is too little data for parameter estimation and because discriminable information is lost when every utterance is forced to have zero mean and unit variance. In this work, we propose to use posterior estimates of the mean and variance in CMVN instead of the maximum likelihood estimates. This Bayesian approach, in addition to providing robust parameter estimates, is also shown to preserve discriminable information without increasing computational cost, making it particularly relevant for Interactive Voice Response (IVR)-based applications. The relative word error rate (WER) reductions of this approach with respect to Cepstral Mean Normalization, CMVN and Histogram Equalization are (i) 40.1%, 27% and 4.3% on the Aurora2 database for all utterances, (ii) 25.7%, 38.6% and 30.4% on the Aurora2 database for short utterances, and (iii) 18.7%, 12.6% and 2.5% on the Aurora4 database. © 2013 IEEE.
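The contrast between maximum likelihood and posterior-style estimates can be illustrated with a minimal sketch. The abstract does not give the paper's priors or update formulas, so the code below is an assumption, not the authors' method: it uses a simple count-weighted interpolation between per-utterance statistics and prior statistics. The names `ml_stats`, `map_stats`, `normalize`, `prior_mean`, `prior_var` and the pseudo-count `tau` are all hypothetical.

```python
from math import sqrt


def ml_stats(frames):
    """Per-dimension maximum-likelihood mean and variance of one utterance."""
    n = len(frames)
    dims = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dims)]
    variances = [
        sum((f[d] - means[d]) ** 2 for f in frames) / n for d in range(dims)
    ]
    return means, variances


def map_stats(frames, prior_mean, prior_var, tau):
    """Smoothed estimates (assumed form, not taken from the paper).

    The utterance statistics are weighted by n / (n + tau) and the prior
    statistics by tau / (n + tau), so short utterances lean on the prior.
    """
    n = len(frames)
    ml_mean, ml_var = ml_stats(frames)
    w = n / (n + tau)
    means = [w * m + (1 - w) * pm for m, pm in zip(ml_mean, prior_mean)]
    # Combine second moments of both sources around the smoothed mean.
    variances = [
        w * (v + (m - mu) ** 2) + (1 - w) * (pv + (pm - mu) ** 2)
        for v, m, mu, pv, pm in zip(ml_var, ml_mean, means, prior_var, prior_mean)
    ]
    return means, variances


def normalize(frames, means, variances, floor=1e-8):
    """Subtract the mean and divide by the standard deviation, per dimension."""
    return [
        [(x - mu) / sqrt(max(var, floor)) for x, mu, var in zip(f, means, variances)]
        for f in frames
    ]


# Toy two-frame "utterance" with two cepstral coefficients per frame.
frames = [[1.0, 2.0], [3.0, 6.0]]

# Standard CMVN: every utterance ends up with zero mean and unit variance.
cmvn = normalize(frames, *ml_stats(frames))

# Smoothed variant: hypothetical prior statistics and pseudo-count.
smoothed = normalize(frames, *map_stats(frames, [0.0, 0.0], [1.0, 1.0], tau=2.0))
```

In this toy case the standard CMVN output has exactly zero mean in each dimension, while the smoothed output does not, which is one way to picture how utterance-specific information can survive normalization. With `tau=0` the smoothed estimates reduce to the maximum likelihood ones.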
About the journal
Journal: 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2013 - Proceedings
Open Access: No
Concepts
  •  Bayesian networks
  •  Database systems
  •  Maximum likelihood estimation
  •  Speech recognition