Robust speech recognition through selection of speaker and environment transforms
Raghavendra R. Bilgi,
Published in
2012
Pages: 4333 - 4336
Abstract
In this paper, we address the problem of robustness to both noise and speaker variability in automatic speech recognition (ASR). We propose the use of pre-computed noise and speaker transforms, where an optimal combination of the two transforms is chosen during testing using the maximum-likelihood (ML) criterion. These pre-computed transforms are obtained during training from data collected under the different noise conditions that are usually encountered for that particular ASR task. The environment transforms are obtained during training using the constrained-MLLR (CMLLR) framework, while for the speaker transforms we use the analytically determined linear-VTLN matrices. Even though the exact noise environment may not be encountered during testing, the ML-based choice of the closest environment transform provides "sufficient" cleaning. This is corroborated by experimental results, with performance comparable to histogram equalization or Vector Taylor Series approaches on the Aurora-2 task. The proposed method is simple, since it involves only the choice of pre-computed environment and speaker transforms, and can therefore be applied with very little test data, unlike many other speaker- and noise-compensation methods. © 2012 IEEE.
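The selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, and it rests on several assumptions the abstract does not state: the acoustic model is stood in for by a single diagonal-covariance GMM, the environment transform (A, b) is applied before the speaker matrix W, and the search is an exhaustive loop over all pairs. The names `gmm_loglik` and `select_transforms` are hypothetical.

```python
import itertools
import numpy as np


def gmm_loglik(Y, weights, means, variances):
    """Total log-likelihood of frames Y (T x D) under a diagonal-covariance GMM."""
    diff = Y[:, None, :] - means[None, :, :]                          # T x K x D
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)  # K
    log_comp = log_norm[None, :] - 0.5 * np.sum(diff ** 2 / variances, axis=2)
    log_comp = log_comp + np.log(weights)[None, :]                    # T x K
    # Log-sum-exp over components, stabilized by subtracting the row maximum.
    m = log_comp.max(axis=1, keepdims=True)
    per_frame = m[:, 0] + np.log(np.sum(np.exp(log_comp - m), axis=1))
    return float(np.sum(per_frame))


def select_transforms(X, env_transforms, spk_transforms, gmm):
    """Pick the (environment, speaker) pair that maximizes the likelihood of X.

    X              : T x D test features
    env_transforms : dict name -> (A, b), pre-computed affine transforms
    spk_transforms : dict name -> W, pre-computed linear matrices
    gmm            : (weights, means, variances) of the stand-in model

    Assumed composition (not given in the abstract): y = W (A x + b).
    """
    T = X.shape[0]
    best = None
    for (e, (A, b)), (s, W) in itertools.product(
        env_transforms.items(), spk_transforms.items()
    ):
        Y = (X @ A.T + b) @ W.T
        # Jacobian term of the feature-space transform: T * log|det(W A)|.
        _, logdet = np.linalg.slogdet(W @ A)
        score = gmm_loglik(Y, *gmm) + T * logdet
        if best is None or score > best[2]:
            best = (e, s, score)
    return best
```

Because only a discrete choice among stored transforms is made, no transform parameters are estimated at test time, which is consistent with the abstract's claim that very little test data is needed. The log-determinant term matters: without it, the search would favor transforms that simply shrink the features toward the model means.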
About the journal
Journal: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN: 15206149
Open Access: No
Concepts (16)
  •  Automatic speech recognition
  •  ENVIRONMENT ADAPTATION
  •  HISTOGRAM EQUALIZATIONS
  •  MAXIMUM LIKELIHOOD CRITERION
  •  Noise conditions
  •  NOISE ENVIRONMENTS
  •  NOISE-COMPENSATION
  •  OPTIMAL COMBINATION
  •  ROBUST SPEECH RECOGNITION
  •  SPEAKER ADAPTATION
  •  Test data
  •  VECTOR TAYLOR SERIES APPROACH
  •  Robustness (control systems)
  •  Signal processing
  •  Speech recognition
  •  Acoustic noise