Improved chirp group delay based algorithms with applications to vocal tract estimation and speech recognition

M. K. Jayesh; C. S. Ramalingam

doi:10.1016/j.specom.2016.02.003


Published in Elsevier B.V.

2016

DOI: 10.1016/j.specom.2016.02.003

Volume: 81

Pages: 72 - 89

Abstract

In this paper we propose two algorithms for estimating the vocal tract from the Fourier transform phase of a given speech segment. In the first approach, we find the zeros of the z-transform, reflect all zeros lying outside the unit circle to the inside, and then compute the chirp group delay spectrum. This method eliminates many of the drawbacks of Bozkurt's CGDGCI method and models spectral valleys well. For high-pitched sounds, however, the vocal tract estimate from this method is corrupted by source oscillations. In the second approach, we cast the problem within the framework of Independent Component Analysis and propose a method in which these effects are considerably suppressed. ASR results on the TIMIT database using features derived from the first method are comparable to those obtained using MFCC features. A further improvement in recognition accuracy over the MFCC baseline was obtained by using a lattice combining technique, resulting in a Phone Error Rate of 17%. Finally, exploiting the method's ability to model spectral valleys well, we propose additional features that distinguish the nasals /m/, /n/, and /ng/, which in turn leads to an increase in their recognition accuracy. © 2016
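The first approach in the abstract (find the zeros, reflect the outside-unit-circle ones inside, compute the chirp group delay) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the analysis-circle radius `rho = 1.05`, and the frequency grid are assumptions made for illustration only; the abstract does not specify them. The sketch also assumes the first sample of the frame is non-zero, and it uses the standard closed-form group delay of a single zero, summed over all zeros.

```python
import numpy as np

def reflect_zeros_inside(zeros):
    """Map every zero with |z| > 1 to its conjugate reciprocal 1/conj(z)."""
    zeros = np.asarray(zeros, dtype=complex)
    out = zeros.copy()
    outside = np.abs(zeros) > 1.0
    out[outside] = 1.0 / np.conj(zeros[outside])
    return out

def chirp_group_delay(frame, n_freqs=256, rho=1.05):
    """Group delay on the circle |z| = rho after reflecting zeros inside.

    frame   : 1-D speech segment; its samples are the coefficients of X(z)
    n_freqs : number of frequency points on [0, pi] (illustrative choice)
    rho     : analysis-circle radius (hypothetical value, not from the paper)
    """
    # Zeros of X(z) = sum_n x[n] z^{-n} are the roots of the polynomial
    # whose coefficients are the samples themselves.
    zeros = reflect_zeros_inside(np.roots(frame))

    omega = np.linspace(0.0, np.pi, n_freqs)
    # Evaluating on |z| = rho is equivalent to scaling each zero by 1/rho.
    a = zeros / rho
    r = np.abs(a)[:, None]
    theta = np.angle(a)[:, None]
    c = np.cos(omega[None, :] - theta)
    # Group delay of one factor (1 - a e^{-j omega}), summed over all zeros.
    tau = (r**2 - r * c) / (1.0 + r**2 - 2.0 * r * c)
    return omega, tau.sum(axis=0)

# Toy usage: a three-sample "frame" with zeros at 0.5 and 2.0.
frame = np.array([1.0, -2.5, 1.0])
omega, tau = chirp_group_delay(frame, n_freqs=64)
```

Working from the zeros directly avoids rebuilding a high-order polynomial from the reflected roots, which tends to be numerically fragile for frames of realistic length.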

About the journal

Journal	Speech Communication
Publisher	Elsevier B.V.
ISSN	0167-6393
Open Access	No

Authors (1)

C. S. Ramalingam
- Department of Electrical Engineering

Concepts (12)

Group delay
Independent component analysis
Z transforms
Combining techniques
Fourier transform phase
Group delay spectrums
Phase processing
Phone error rate
Recognition accuracy
Speech segments
Vocal-tracts
Speech recognition
