Non-negative subspace projection during conventional MFCC feature extraction for noise robust speech recognition
Raghavendra R. Bilgi,
Published in Institute of Electrical and Electronics Engineers Inc.
2013
Abstract
An additional feature processing step based on Non-negative Matrix Factorization (NMF) is proposed for inclusion in the conventional extraction of Mel-frequency cepstral coefficients (MFCC), to achieve noise robustness in HMM-based speech recognition. The proposed approach reconstructs the log-Mel filterbank outputs of speech data from a set of building blocks that form the bases of a speech subspace. The bases are learned by applying standard NMF to the training data. A variation on learning the bases is also proposed, which uses histogram-equalized activation coefficients during training to achieve noise robustness. The proposed methods give up to 5.96% absolute improvement in recognition accuracy on the Aurora-2 task over a baseline with standard MFCCs, and up to 13.69% improvement when combined with other feature normalization techniques such as Histogram Equalization (HEQ) and Heteroscedastic Linear Discriminant Analysis (HLDA). © 2013 IEEE.
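The subspace projection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the Euclidean-cost multiplicative updates of standard NMF, uses random non-negative matrices in place of real log-Mel filterbank outputs, and picks the dimensions (23 channels, 10 bases) arbitrarily. The function names `learn_bases` and `project` are hypothetical. Real log-Mel values can be negative, so some non-negative mapping of the inputs is also assumed.

```python
import numpy as np

def learn_bases(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor non-negative V (channels x frames) as W @ H.

    Uses multiplicative updates for the Euclidean cost (an assumption;
    the abstract only says "standard NMF").
    """
    rng = np.random.default_rng(seed)
    n_channels, n_frames = V.shape
    W = rng.random((n_channels, rank)) + eps   # bases of the speech subspace
    H = rng.random((rank, n_frames)) + eps     # activation coefficients
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def project(V, W, n_iter=200, eps=1e-9, seed=0):
    """Reconstruct V from fixed bases W by updating only the activations."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    return W @ H

# Placeholder data standing in for non-negative filterbank features.
data_rng = np.random.default_rng(1)
train = data_rng.random((23, 500))   # hypothetical: 23 channels, 500 frames
test = data_rng.random((23, 40))

W, H_train = learn_bases(train, rank=10)
recon = project(test, W)             # features constrained to the learned subspace
```

In a full front end, the reconstructed filterbank outputs would then pass through the usual DCT step to yield cepstral coefficients. The histogram-equalized variant mentioned in the abstract is not shown here because its details are not given.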
About the journal
Journal: 2013 National Conference on Communications, NCC 2013
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISSN: 15503607
Open Access: No
Concepts (13)
  •  Face recognition
  •  Factorization
  •  Feature extraction
  •  Graphic methods
  •  CONVENTIONAL EXTRACTION
  •  FEATURE NORMALIZATION
  •  HISTOGRAM EQUALIZATIONS
  •  Linear discriminant analysis
  •  MEL-FREQUENCY CEPSTRAL COEFFICIENTS
  •  NOISE ROBUST SPEECH RECOGNITION
  •  NOISE ROBUSTNESS
  •  Nonnegative matrix factorization
  •  Speech recognition