Modeling Accent Groups for Improved Prosody in TTS

Gopala Krishna Anumanchipalli

Date

15:00, Friday, February 8th, 2013
Room 336

Speaker

Gopala Krishna Anumanchipalli

Abstract

In this talk, I will present an ‘Accent Group’ based intonation model for statistical parametric speech synthesis. We propose an approach that automatically models phonetic realizations of fundamental frequency (F0) contours as a sequence of intonational events, each anchored to a group of syllables (an Accent Group). We train a speaker-specific accent grouping model using a stochastic context-free grammar and contextual decision trees on the syllables. This model is used to ‘parse’ unseen text into its constituent accent groups, and appropriate intonation is then predicted over each group. The performance of the model is evaluated objectively and subjectively on a variety of prosodically diverse tasks: read speech, broadcast news, and audiobooks.

From HLT@INESC-ID
