Building speech synthesis systems for Indian languages
Pradhan A., Aswin Shanmugam S., Kasthuri G.R.
Published in Institute of Electrical and Electronics Engineers Inc.
In this paper, new efforts to build text-to-speech (TTS) synthesis systems for Indian languages are presented. The synthesisers are built around both concatenative speech synthesis and statistical parametric speech synthesis frameworks. Text-to-speech synthesis systems require accurate segmentation. Obtaining accurate segmentation at the phone level is a difficult task. Manual segmentation leads to human errors, while automatic segmentation using statistical approaches (hidden Markov model based approaches) leads to poor boundary information when the amount of data used for training is small. A semi-automatic, group delay based syllable segmentation tool is discussed. The tool is semi-automatic as some of the boundaries obtained are inaccurate and have to be manually corrected. Next, a segmentation algorithm that uses both HMM based segmentation and group delay based segmentation is used to obtain accurate boundaries automatically. The boundaries obtained are used in the syllable-based synthesiser for unit selection. In the statistical phone-based synthesiser, embedded reestimation is performed at the phone level. Syllable-based and penta-phone based HMMs are used for building the synthesiser. TTS systems for 12 different Indian languages, namely Tamil, Hindi, Marathi, Malayalam, Telugu, Rajasthani, Bengali, Odia, Assamese, Manipuri, Kannada and Gujarati, are built using semi-automatic segmentation, and synthesisers have been built for 7 Indian languages using automatic segmentation. Evaluation of the semi-automatic segmentation systems indicates that the MOS (mean opinion score) is above 3.0 for most of the languages. Pair comparison tests on semi-automatic vs. automatic segmentation show that automatic segmentation is preferred. © 2015 IEEE.
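The abstract does not spell out how the HMM based and group delay based boundaries are combined. As a rough illustration only, the sketch below assumes one plausible scheme: each HMM boundary is snapped to the nearest group delay boundary if one lies within a small tolerance, and is otherwise left unchanged. The function name `refine_boundaries`, the `tolerance` parameter, and the example times are hypothetical and are not taken from the paper.

```python
from bisect import bisect_left


def refine_boundaries(hmm_boundaries, gd_boundaries, tolerance=0.02):
    """Snap HMM boundaries to nearby group delay boundaries.

    All times are in seconds. This is an assumed combination rule for
    illustration, not the algorithm described in the paper.
    """
    gd_sorted = sorted(gd_boundaries)
    refined = []
    for b in hmm_boundaries:
        # Locate the group delay boundaries on either side of b.
        i = bisect_left(gd_sorted, b)
        candidates = gd_sorted[max(i - 1, 0):i + 1]
        nearest = min(candidates, key=lambda g: abs(g - b), default=None)
        if nearest is not None and abs(nearest - b) <= tolerance:
            # A group delay boundary is close enough: trust it.
            refined.append(nearest)
        else:
            # No nearby group delay boundary: keep the HMM estimate.
            refined.append(b)
    return refined


# Hypothetical boundary times, for illustration only.
hmm = [0.10, 0.35, 0.80]
group_delay = [0.11, 0.50, 0.79]
print(refine_boundaries(hmm, group_delay))
```

In this toy input the first and last HMM boundaries have a group delay boundary within 20 ms and are moved to it, while the middle one has no close match and is kept as is.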
About the journal
Journal: 2015 21st National Conference on Communications, NCC 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
Open Access: No
Authors (3)