Improved phone-cluster adaptive training acoustic model
Published in Institute of Electrical and Electronics Engineers Inc.
2016
Abstract
Phone-cluster adaptive training (Phone-CAT) is a subspace-based acoustic modeling technique inspired by cluster adaptive training (CAT) and the subspace Gaussian mixture model (SGMM). This paper explores three extensions to the basic Phone-CAT model to improve its recognition performance: increasing the phonetic subspace dimension, including sub-states, and including a speaker subspace. The latter two extensions are implemented much as in SGMM, since both acoustic models share a similar subspace framework. However, because the phonetic subspace dimension of Phone-CAT is constrained to equal the number of monophones, the first extension is not straightforward to implement. We propose a Two-stage Phone-CAT model in which the phonetic subspace dimension is increased to the number of monophone states. This model still retains the center-phone-capturing property of the state-specific vectors in basic Phone-CAT. Experiments on a 33-hour training subset of the Switchboard database show that the proposed extensions improve the recognition performance of the basic Phone-CAT model. © 2016 IEEE.
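The abstract does not give the model equations, so the following is only a rough sketch of the dimensionality constraint it describes. It assumes, as an illustration, that a context-dependent state's mean is a weighted combination of per-cluster means, with one weight per cluster held in a state-specific vector. All names, toy sizes, and values here are hypothetical, and the sketch does not reproduce the paper's two-stage estimation procedure; it only shows the weight vector growing from one entry per monophone to one entry per monophone state.

```python
def interpolate_mean(cluster_means, weights):
    """Return the weighted sum of cluster means (one weight per cluster)."""
    if len(cluster_means) != len(weights):
        raise ValueError("need exactly one weight per cluster mean")
    dim = len(cluster_means[0])
    return [
        sum(w * mean[d] for w, mean in zip(weights, cluster_means))
        for d in range(dim)
    ]


# Toy sizes, chosen only for illustration (not taken from the paper).
NUM_MONOPHONES = 4
STATES_PER_MONOPHONE = 3
FEATURE_DIM = 2

# Assumed basic setup: one cluster per monophone, so the state-specific
# vector has exactly NUM_MONOPHONES entries.
basic_clusters = [[float(p), float(-p)] for p in range(NUM_MONOPHONES)]
basic_weights = [0.1, 0.7, 0.1, 0.1]  # hypothetical state-specific vector
basic_mean = interpolate_mean(basic_clusters, basic_weights)

# Assumed enlarged setup: one cluster per monophone state, so the vector
# grows to NUM_MONOPHONES * STATES_PER_MONOPHONE entries.
num_state_clusters = NUM_MONOPHONES * STATES_PER_MONOPHONE
two_stage_clusters = [[float(k), float(-k)] for k in range(num_state_clusters)]
two_stage_weights = [1.0 / num_state_clusters] * num_state_clusters
two_stage_mean = interpolate_mean(two_stage_clusters, two_stage_weights)

# Index of the largest weight; in this toy reading it stands in for the
# cluster that dominates the state (an assumption, not the paper's method).
dominant_cluster = max(range(NUM_MONOPHONES), key=lambda p: basic_weights[p])

print(len(basic_weights), len(two_stage_weights), dominant_cluster)
```

Under these assumptions, the only structural difference between the two setups is the number of clusters, which is why tying the cluster count to the monophone inventory fixes the subspace dimension.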
About the journal
Journal: 2016 International Conference on Signal Processing and Communications, SPCOM 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
Open Access: No
Concepts (10)
  •  Gaussian distribution
  •  Linguistics
  •  Signal processing
  •  Vectors
  •  Acoustic model
  •  CLUSTER ADAPTIVE TRAINING
  •  MONOPHONES
  •  SUBSPACE BASED
  •  SUBSPACE GAUSSIAN MIXTURE MODELS
  •  Telephone sets