Natural sounding TTS based on syllable-like units

Hema Murthy; C. S. Ramalingam; Samuel Thomas; M. Nageshwara Rao

Conferences

Published in IEEE

2006

Abstract

In this work we describe a new "syllable-like" speech unit that is suitable for concatenative speech synthesis. These units are generated automatically by a group-delay-based segmentation algorithm and correspond acoustically to the form C*VC* (C: consonant, V: vowel). The effectiveness of the unit is demonstrated by synthesizing natural-sounding speech in Tamil, a regional Indian language. Quality improves significantly when bisyllable units are used in addition to monosyllables, with results far superior to the traditional diphone-based approach. An important advantage of this approach is that it eliminates the need for prosody rules: because the fundamental frequency (f0) is part of the target cost, the unit selection procedure chooses the best unit from among the many candidates. The naturalness of the synthesized speech demonstrates the effectiveness of this approach.
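
To illustrate how placing f0 in the target cost can stand in for explicit prosody rules, the following Python sketch runs a generic unit selection search over candidate syllable-like units. It is not the authors' implementation. The `Unit` fields, the absolute-difference target cost, the f0-continuity join cost, the `w_join` weight, and the unit labels are all assumptions made for illustration; the abstract states only that f0 is part of the target cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """A recorded syllable-like unit (hypothetical fields, for illustration)."""
    name: str        # label of the database unit, e.g. "va_1"
    f0_start: float  # f0 at the start of the unit, in Hz
    f0_end: float    # f0 at the end of the unit, in Hz


def target_cost(unit, target_f0):
    """Assumed target cost: distance between the unit's mean f0 and the target f0."""
    mean_f0 = (unit.f0_start + unit.f0_end) / 2
    return abs(mean_f0 - target_f0)


def join_cost(prev, nxt):
    """Assumed join cost: f0 discontinuity at the concatenation point."""
    return abs(prev.f0_end - nxt.f0_start)


def select_units(candidates, target_f0s, w_join=1.0):
    """Pick one unit per position, minimizing total target + weighted join cost.

    candidates: one list of Unit per target position.
    target_f0s: desired f0 (Hz) per target position.
    """
    # best[i][j] = (lowest cost of any path ending in candidates[i][j], backpointer)
    best = [[(target_cost(u, target_f0s[0]), None) for u in candidates[0]]]
    for i in range(1, len(candidates)):
        row = []
        for u in candidates[i]:
            cost, back = min(
                (best[i - 1][k][0] + w_join * join_cost(p, u), k)
                for k, p in enumerate(candidates[i - 1])
            )
            row.append((cost + target_cost(u, target_f0s[i]), back))
        best.append(row)

    # Trace back from the cheapest final unit.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    path = []
    for i in range(len(candidates) - 1, -1, -1):
        path.append(candidates[i][j])
        j = best[i][j][1]
    path.reverse()
    return path
```

Called with one candidate list per target syllable and one target f0 per position, `select_units` returns the lowest-cost sequence. In this sketch the pitch contour comes from the recorded units themselves, which is the sense in which no separate prosody rule is applied after selection.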