Modeling the F0 curve for Speech Synthesis

From HLT@INESC-ID

Gopala Krishna Anumanchipalli
Gopala is a PhD student at the Language Technologies Institute (LTI), CMU, and at INESC-ID Lisboa, IST. He is advised by Dr. Alan W Black and Dr. Luis Oliveira, and is currently at INESC-ID. His interests span everything to do with language, but his work focuses on building models and transformation approaches for prosody in speech synthesis. He works on the PT-Star project, which aims at speech-to-speech machine translation of video lectures.

Date

  • 15:00, Friday, September 3rd, 2010
  • Room 336

Speaker

  • Gopala K Anumanchipalli, Carnegie Mellon University, USA and INESC-ID Lisboa, IST

Abstract

In this talk I will review some approaches to modeling the fundamental frequency (F0) contour. I will detail the F0 modeling strategy currently used in Clustergen, CMU's statistical parametric synthesis framework. I will then describe our recent work on improving this baseline by incorporating longer-range (syllable- and phrase-level) features into the F0 model. We use the Tilt model of intonation for this work; I will briefly describe the Tilt framework and mention alternative frameworks used in intonation modeling.


Note: This seminar will be held in English.