Exploiting frequency-scaling invariance properties of the scale transform for automatic speech recognition

Srinivasan Umesh; Rose R.C.; Parthasarathy S.

Published in International Speech Communication Association

2000

Abstract

This paper presents an experimental study of applying the scale transform to improve the performance of speaker-independent continuous speech recognition. Three major results are described. First, scale-transform-based magnitude cepstrum coefficients (STCC) were compared with mel-scale filter bank cepstrum coefficients (MFCC) on a telephone-based connected digit recognition task. The STCC were shown to achieve performance close to that of the MFCC. Second, a simple frequency-normalization procedure applied to the scale-transform representation improved performance on the connected digit recognition task with respect to the MFCC. Finally, in a more controlled experimental setting using the TIMIT database, applying phone-specific frequency warpings was shown to improve phone classification performance over using a single speaker-specific warping. This last result may have general implications for all frequency-warping-based speaker normalization procedures.
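The frequency-scaling invariance named in the title can be illustrated numerically. The sketch below is not the paper's STCC front end, which the abstract does not describe. It assumes the standard scale transform definition, D(c) = (2π)^(-1/2) ∫ X(f) f^(-jc - 1/2) df, approximates the integral with a Riemann sum on a log-frequency grid, and checks that |D(c)| is unchanged when a spectral envelope X(f) is replaced by √α X(αf). The two-peak envelope, the value of ALPHA, the grid limits, and all function names are hypothetical choices made for illustration.

```python
import cmath
import math


def envelope(f):
    """Hypothetical two-peak spectral envelope (illustrative, not from the paper)."""
    u = math.log(f)
    return (math.exp(-0.5 * ((u - math.log(700.0)) / 0.15) ** 2)
            + 0.6 * math.exp(-0.5 * ((u - math.log(1800.0)) / 0.15) ** 2))


def scale_transform(spectrum, c, f_min=50.0, f_max=20000.0, n=2048):
    """Riemann-sum approximation of the scale transform D(c).

    Substituting f = exp(u) turns the defining integral into a Fourier
    integral of spectrum(exp(u)) * exp(u / 2) over log-frequency u.
    """
    u_min, u_max = math.log(f_min), math.log(f_max)
    du = (u_max - u_min) / n
    total = 0j
    for k in range(n):
        u = u_min + k * du
        total += spectrum(math.exp(u)) * math.exp(0.5 * u) * cmath.exp(-1j * c * u)
    return total * du / math.sqrt(2.0 * math.pi)


ALPHA = 1.2  # assumed frequency-scaling factor, standing in for a speaker difference


def scaled_envelope(f):
    """The same envelope with its frequency axis scaled by ALPHA."""
    return math.sqrt(ALPHA) * envelope(ALPHA * f)


scale_values = [0.5 * i for i in range(21)]  # c = 0.0, 0.5, ..., 10.0
mag_ref = [abs(scale_transform(envelope, c)) for c in scale_values]
mag_scaled = [abs(scale_transform(scaled_envelope, c)) for c in scale_values]

# Frequency scaling only multiplies D(c) by a phase factor, so the magnitudes agree.
max_rel_err = max(abs(a - b) for a, b in zip(mag_ref, mag_scaled)) / max(mag_ref)
print(f"max relative difference in |D(c)|: {max_rel_err:.2e}")
```

The two envelopes differ substantially at any fixed frequency, yet their scale-transform magnitudes coincide up to numerical error. This holds only while the envelope is negligible at the edges of the integration range, which the grid limits here are chosen to guarantee.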

About the journal

Journal	6th International Conference on Spoken Language Processing, ICSLP 2000
Publisher	International Speech Communication Association
Open Access	No

Authors (1)

Srinivasan Umesh
- Department of Electrical Engineering
