Warping functions in speech
, Cohen L., Nelson D.
Published in SPIE
1998
Volume: 3458
Pages: 194 - 209
Abstract
We describe experiments addressing the relation between the same enunciation as produced by different speakers. Our previous work indicated that frequencies are, to a first approximation, scaled uniformly. In this paper we report results addressing possible corrections to uniform scaling. Our results show that the scaling is nonuniform; that is, the formant frequencies of different speakers scale differently at different frequencies. We discuss how this leads to the mathematical issue of separating the spectrum into speaker-dependent and speaker-independent parts. We introduce the concept of a universal scaling function that is aimed at achieving this separation. The fundamental idea is to find a frequency-axis transformation (warping function) that transforms the energy density spectrum (the squared absolute value of the Fourier transform of the enunciation) in such a way that a further Fourier transform of the resulting function achieves this separation. We discuss this procedure and relate it to the scale transform. Using real speech data, we obtain such a transformation function. The resulting function is very similar to the Mel scale, which had previously been obtained only from psychoacoustic (hearing-based) experiments. That similar scales are obtained from both hearing and speech production (as reported here) is fundamental to the understanding of speech and hearing.
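The warp-then-transform idea in the abstract can be illustrated for the simplest case, uniform scaling. If one speaker's spectrum is a scaled copy of another's, resampling on a logarithmic frequency axis turns the scaling into a shift, and the magnitude of a further Fourier transform is unchanged by a shift. The following is a minimal sketch under that assumption only: the toy spectrum, the scale factor `alpha`, and the helper names are hypothetical, and the logarithmic warp used here is not the warping function estimated in the paper, which addresses the nonuniform case.

```python
import numpy as np

def toy_spectrum(f, formants=(500.0, 1500.0, 2500.0), width=0.05):
    """Hypothetical energy density: bumps that are Gaussian in log-frequency."""
    logf = np.log(f)
    return sum(np.exp(-(logf - np.log(fk)) ** 2 / (2 * width ** 2))
               for fk in formants)

def warped_fourier_magnitude(spectrum, warp_inverse, lo, hi, n=1024):
    """Sample the spectrum uniformly on the warped axis, then take |FFT|."""
    nu = np.linspace(lo, hi, n)        # uniform grid on the warped axis
    freqs = warp_inverse(nu)           # map back to frequency in Hz
    return np.abs(np.fft.rfft(spectrum(freqs)))

alpha = 1.2                            # assumed uniform speaker scale factor

def speaker_a(f):
    return toy_spectrum(f)

def speaker_b(f):
    # Same enunciation with every formant multiplied by alpha.
    return toy_spectrum(f / alpha)

# Logarithmic warp: scaling by alpha becomes a shift by log(alpha).
log_lo, log_hi = np.log(50.0), np.log(20000.0)
mag_a = warped_fourier_magnitude(speaker_a, np.exp, log_lo, log_hi)
mag_b = warped_fourier_magnitude(speaker_b, np.exp, log_lo, log_hi)

# No warp (linear frequency axis), for comparison.
def identity(x):
    return x

lin_a = warped_fourier_magnitude(speaker_a, identity, 50.0, 20000.0)
lin_b = warped_fourier_magnitude(speaker_b, identity, 50.0, 20000.0)

print("max difference, log-warped axis:", np.max(np.abs(mag_a - mag_b)))
print("max difference, linear axis:    ", np.max(np.abs(lin_a - lin_b)))
```

In this toy setting the speaker factor survives only in the phase of the second transform, so the magnitudes on the log-warped axis agree while those on the linear axis do not. When the scaling is nonuniform, as the abstract reports for real speech, a logarithmic warp no longer yields a pure shift, which is the motivation for seeking a different warping function.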
About the journal
Journal: Proceedings of SPIE - The International Society for Optical Engineering
Publisher: SPIE
ISSN: 0277-786X
Open Access: No