Computationally efficient speaker identification using fast-MLLR based anchor modeling
Published in
2012
Pages: 4357 - 4360
Abstract
In this paper, we propose a computationally efficient method to identify a speaker from a large population of speakers. The proposed method builds on our earlier Fast Maximum Likelihood Linear Regression (Fast-MLLR) anchor modeling technique [1], which provides performance comparable to the conventional anchor modeling system while reducing computation time significantly by computing likelihoods efficiently from sufficient statistics of the data and the anchor-specific MLLR matrix. However, both of these systems still require a Gaussian Mixture Model-Universal Background Model (GMM-UBM) based back-end system to choose the optimal speaker, which is computationally heavy. We show that by applying Linear Discriminant Analysis (LDA) and Within-Class Covariance Normalization (WCCN) to the Speaker Characterization Vector (SCV) of the Fast-MLLR method, we can combine computational efficiency with discriminative capability, yielding a system that uses a simple cosine-distance measure to identify speakers yet performs significantly better than both the full-blown GMM-UBM system and the anchor-model system. More importantly, no "back-end" system is needed. Experimental results on NIST 2004 SRE show that the proposed method reduces the identification error rate by an absolute 2% while taking only 2/3 of the time taken by the efficient Fast-MLLR system and only 20% of the time taken by the stand-alone GMM-UBM system. © 2012 IEEE.
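The scoring pipeline described in the abstract (LDA, then WCCN, then cosine-distance identification) can be illustrated with a minimal sketch. This is not the authors' implementation: the Fast-MLLR SCV extraction is not reproduced, and random synthetic vectors stand in for SCVs. The function names (`fit_lda`, `fit_wccn`, `identify`), the dimensions, the mean-centering step, and the use of per-speaker mean vectors as enrolled models are all assumptions made for illustration.

```python
import numpy as np

def fit_lda(X, y, n_components):
    """Fisher LDA: return a (d x n_components) projection matrix."""
    d = X.shape[1]
    mean_all = X.mean(axis=0)
    Sw = 1e-6 * np.eye(d)          # within-class scatter (small ridge for stability)
    Sb = np.zeros((d, d))          # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean_all)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    # Solve Sb v = lambda Sw v by whitening with the Cholesky factor of Sw.
    L_inv = np.linalg.inv(np.linalg.cholesky(Sw))
    evals, evecs = np.linalg.eigh(L_inv @ Sb @ L_inv.T)
    top = np.argsort(evals)[::-1][:n_components]
    A = L_inv.T @ evecs[:, top]
    return A / np.linalg.norm(A, axis=0)   # unit-length projection directions

def fit_wccn(X, y):
    """WCCN: return B with B @ B.T = inv(W), W = mean within-class covariance."""
    classes = np.unique(y)
    d = X.shape[1]
    W = 1e-6 * np.eye(d)
    for c in classes:
        W += np.cov(X[y == c], rowvar=False, bias=True) / len(classes)
    return np.linalg.cholesky(np.linalg.inv(W))

def identify(test_vecs, models):
    """Return, for each test vector, the index of the model with highest cosine similarity."""
    t = test_vecs / np.linalg.norm(test_vecs, axis=1, keepdims=True)
    m = models / np.linalg.norm(models, axis=1, keepdims=True)
    return np.argmax(t @ m.T, axis=1)

# Synthetic stand-in for SCVs (assumption): 5 speakers, 20-dim vectors.
rng = np.random.default_rng(0)
n_spk, dim, n_train, n_test = 5, 20, 20, 10
spk_means = 2.0 * rng.normal(size=(n_spk, dim))
y_train = np.repeat(np.arange(n_spk), n_train)
y_test = np.repeat(np.arange(n_spk), n_test)
X_train = spk_means[y_train] + rng.normal(size=(len(y_train), dim))
X_test = spk_means[y_test] + rng.normal(size=(len(y_test), dim))

# Center on the training mean (assumption), then train LDA and WCCN.
mu = X_train.mean(axis=0)
A = fit_lda(X_train - mu, y_train, n_components=n_spk - 1)
B = fit_wccn((X_train - mu) @ A, y_train)

def project(X):
    return (X - mu) @ A @ B

# Enroll each speaker as the mean of their projected training vectors.
Z_train = project(X_train)
models = np.stack([Z_train[y_train == s].mean(axis=0) for s in range(n_spk)])

predicted = identify(project(X_test), models)
accuracy = float(np.mean(predicted == y_test))
print(f"identification accuracy on synthetic data: {accuracy:.2f}")
```

Once the projection is trained, identifying a test utterance costs one matrix-vector product and one cosine score per enrolled speaker, which is consistent with the abstract's claim that no GMM-UBM back-end is needed. The accuracy printed here reflects only the synthetic data and says nothing about the NIST 2004 SRE results.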
About the journal
Journal: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN: 1520-6149
Open Access: No
Concepts (10)
  •  ANCHOR MODELS
  •  FAST MLLR
  •  LDA
  •  SPEAKER IDENTIFICATION
  •  WCCN
  •  Computational efficiency
  •  Discriminant analysis
  •  LOUDSPEAKERS
  •  Signal processing
  •  Speech recognition