Articulatory and stacked bottleneck features for low resource speech recognition

Srinivasan Umesh; Shetty V.M.; Sharon R.A.; Abraham B.; Seeram T.; Arul Prakash Karaiyan; B Ravindran

doi:10.21437/Interspeech.2018-2226

Other

Published in International Speech Communication Association

2018

Volume: 2018-September

Pages: 3202 - 3206

Abstract

In this paper, we discuss the benefits of using articulatory and stacked bottleneck features (SBF) for low resource speech recognition. Articulatory features (AF), which capture the underlying attributes of speech production, are found to be robust to channel and speaker variations. However, building an efficient articulatory classifier to extract AF requires an enormous amount of data. For low resource acoustic modeling, we propose to train the bidirectional long short-term memory (BLSTM) articulatory classifier by pooling data from the available low resource Indian languages, namely Gujarati, Tamil, and Telugu. This is done in the context of the Microsoft Indian Language challenge. Similarly, we train a multilingual bottleneck feature extractor and an SBF extractor using the pooled data. To bias the SBF network towards the target language, the second network in the stacked architecture was trained using the target language alone. The performance of the ASR system trained with stand-alone AF is observed to be on par with that of the multilingual bottleneck features. When the AF and the biased SBF are appended, they are found to outperform the conventional filterbank features in the multilingual deep neural network (DNN) framework and the high-resolution Mel frequency cepstral coefficient (MFCC) features in the time-delay neural network (TDNN) framework. © 2018 International Speech Communication Association. All rights reserved.
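
The two feature operations mentioned in the abstract, feeding a second network from the first network's bottleneck outputs and appending AF to SBF, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the feature dimensions (30 and 40), the context offsets (-10, -5, 0, 5, 10), and the helper names splice_context and append_features are assumptions made for illustration, and random arrays stand in for real network outputs.

```python
import numpy as np


def splice_context(feats, offsets):
    """Stack time-shifted copies of each frame side by side.

    Frames that would fall outside the utterance are replaced by the
    first or last frame (edge clamping).
    """
    num_frames = feats.shape[0]
    idx = np.arange(num_frames)
    shifted = [feats[np.clip(idx + o, 0, num_frames - 1)] for o in offsets]
    return np.concatenate(shifted, axis=1)


def append_features(*streams):
    """Concatenate frame-synchronous feature streams along the feature axis."""
    if len({s.shape[0] for s in streams}) != 1:
        raise ValueError("all streams must have the same number of frames")
    return np.concatenate(streams, axis=1)


rng = np.random.default_rng(0)
num_frames = 200

# Placeholders (assumed sizes): articulatory classifier outputs and
# first-stage bottleneck outputs for one utterance.
af = rng.standard_normal((num_frames, 30))
bn1 = rng.standard_normal((num_frames, 40))

# Assumed stacking scheme: the second-stage network sees the first-stage
# bottleneck outputs spliced over a wide temporal context.
second_stage_input = splice_context(bn1, offsets=(-10, -5, 0, 5, 10))

# Placeholder for the second-stage (target-language-biased) bottleneck output.
sbf = rng.standard_normal((num_frames, 40))

# Appended AF + SBF features, as would be fed to the acoustic model.
combined = append_features(af, sbf)
```

In this sketch the appended stream simply has one row per frame and the AF and SBF dimensions side by side; any normalization or dimensionality reduction applied before acoustic model training is outside what the abstract states.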

About the journal

Journal	Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Publisher	International Speech Communication Association
ISSN	2308457X
Open Access	No

Authors (3)

Srinivasan Umesh
- Department of Electrical Engineering
Arul Prakash Karaiyan
- Department of Applied Mechanics
B Ravindran
- Department of Computer Science and Engineering
