Acoustic modelling for speech recognition in Indian languages in an agricultural commodities task domain
Published by Elsevier B.V.
2014
Volume: 56
Issue: 1
Pages: 167-180
Abstract
In developing speech recognition based services for any task domain, it is necessary to account for the support of an increasing number of languages over the life of the service. This paper considers a small vocabulary speech recognition task in multiple Indian languages. To configure a multi-lingual system in this task domain, an experimental study is presented using data from two linguistically similar languages - Hindi and Marathi. We do so by training a subspace Gaussian mixture model (SGMM) (Povey et al., 2011; Rose et al., 2011) under a multi-lingual scenario (Burget et al., 2010; Mohan et al., 2012a). Speech data was collected from the targeted user population to develop spoken dialogue systems in an agricultural commodities task domain for this experimental study. It is well known that acoustic, channel and environmental mismatch between data sets from multiple languages is an issue while building multi-lingual systems of this nature. As a result, we use a cross-corpus acoustic normalization procedure which is a variant of speaker adaptive training (SAT) (Mohan et al., 2012a). The resulting multi-lingual system provides the best speech recognition performance for both languages. Further, the effect of sharing "similar" context-dependent states from the Marathi language on the Hindi speech recognition performance is presented. © 2013 Elsevier B.V. All rights reserved.
About the journal
Journal: Speech Communication
Publisher: Elsevier B.V.
ISSN: 0167-6393
Open Access: No
Concepts (18)
  •  Agriculture
  •  Aluminum
  •  Deep neural networks
  •  Gaussian distribution
  •  Linguistics
  •  Modeling languages
  •  Population statistics
  •  Speech
  •  Speech processing
  •  AGRICULTURAL COMMODITIES
  •  Automatic speech recognition
  •  SPEAKER ADAPTIVE TRAININGS
  •  SPEECH RECOGNITION PERFORMANCE
  •  SPOKEN DIALOGUE SYSTEM
  •  SUB-SPACE MODELLING
  •  SUBSPACE GAUSSIAN MIXTURE MODELS
  •  UNDER-RESOURCED LANGUAGES
  •  Speech recognition