Fast computation of speaker characterization vector using MLLR and sufficient statistics in anchor model framework
Published in
2010
Pages: 2738 - 2741
Abstract
The anchor modeling technique has been shown to be useful in reducing computational complexity for speaker identification and indexing of large audio databases. In this technique, speakers are projected onto a talker space spanned by a set of predefined anchor models, which are usually represented by Gaussian Mixture Models (GMMs). Characterizing each speaker involves calculating the likelihood with each of the anchor models, and is therefore expensive even in the GMM Universal Background Model (GMM-UBM) framework using top-C mixtures per feature vector. In this paper, we propose a computationally efficient (Fast) method to calculate the likelihood of speech utterances using anchor speaker-specific Maximum Likelihood Linear Regression (MLLR) matrices and sufficient statistics estimated from the utterance. We show that the proposed method is faster by an order of magnitude for evaluating the speaker characterization vector. Since anchor models use simple distance measures to identify speakers, they are used as a first stage to select N probable speakers, and are then cascaded with a conventional GMM-UBM stage that finally identifies the speaker from this reduced set. We show that the proposed method in cascade combination performs 4.21× faster than the conventional cascade anchor model system with comparable performance. The experiments are performed on the NIST 2004 SRE core condition. © 2010 ISCA.
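The two-stage cascade described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the speaker characterization vectors have already been computed (the fast MLLR likelihood computation is not shown), it assumes Euclidean distance as the "simple distance measure", and it stands in for the GMM-UBM stage with a hypothetical `rescore` callback. All names and the toy numbers are invented for illustration.

```python
import math
from typing import Callable, Dict, List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two characterization vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def shortlist(test_scv: Sequence[float],
              enrolled: Dict[str, Sequence[float]],
              n: int) -> List[str]:
    """Stage 1: return the n enrolled speakers whose vectors lie closest to test_scv."""
    ranked = sorted(enrolled, key=lambda spk: euclidean(test_scv, enrolled[spk]))
    return ranked[:n]


def identify(test_scv: Sequence[float],
             enrolled: Dict[str, Sequence[float]],
             n: int,
             rescore: Callable[[str], float]) -> str:
    """Stage 2: rescore only the shortlisted speakers and return the best one.

    `rescore` is a hypothetical placeholder for a conventional GMM-UBM score
    of the test utterance against one speaker model (higher is better).
    """
    candidates = shortlist(test_scv, enrolled, n)
    return max(candidates, key=rescore)


# Toy data (invented): three enrolled speakers in a 3-anchor talker space.
enrolled = {
    "spk_a": [0.10, -0.40, 0.30],
    "spk_b": [0.20, -0.30, 0.25],
    "spk_c": [-0.90, 0.80, 0.00],
}
test_scv = [0.15, -0.35, 0.30]
toy_scores = {"spk_a": -41.2, "spk_b": -39.8, "spk_c": -55.0}

top2 = shortlist(test_scv, enrolled, 2)
winner = identify(test_scv, enrolled, 2, toy_scores.__getitem__)
```

The point of the design is that the expensive second-stage score is evaluated for only N candidates rather than for every enrolled speaker, so the overall cost depends mainly on how cheaply the characterization vector itself can be computed.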
About the journal
Journal: Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010
Open Access: No
Concepts (23)
  •  ANCHOR MODELS
  •  AUDIO DATABASE
  •  Computationally efficient
  •  DISTANCE MEASURE
  •  Fast computation
  •  FAST MLLR
  •  Feature vectors
  •  Frame-work
  •  GAUSSIAN MIXTURE MODELS
  •  GMM-UBM
  •  MAXIMUM LIKELIHOOD LINEAR REGRESSION
  •  Modeling technique
  •  SPEAKER IDENTIFICATION
  •  SPEECH UTTERANCE
  •  SUFFICIENT STATISTICS
  •  UNIVERSAL BACKGROUND MODEL
  •  C (programming language)
  •  Characterization
  •  Computational complexity
  •  INDEXING (MATERIALS WORKING)
  •  LOUDSPEAKERS
  •  Maximum likelihood
  •  Speech recognition