PT-STAR (Speech Translation Advanced Research to and from Portuguese)

Latest revision as of 17:13, 3 June 2020

Funded by: FCT (Carnegie Mellon | Portugal Research Projects Program) [CMU-PT/HuMach/0039/2008]
Start date: 01 May 2009
Duration: 36 months

Partners

INESC-ID – Instituto de Engenharia de Sistemas e Computadores Investigação e Desenvolvimento em Lisboa, Portugal
Carnegie Mellon University (CMU)
Universidade da Beira Interior (UBI)
Fundação da Universidade de Lisboa (FUL/UL)

Summary

The main goal of PT-STAR is to improve current speech-to-speech machine translation (S2SMT) systems from Portuguese to English and vice versa. A third language, Chinese, will be the target of a PhD thesis on machine translation, jointly supervised by professors from L2F and the University of Macau. This university's informal cooperation within the current proposal will therefore broaden the project's scope to encompass typologically different languages.

INESC-ID main researchers

Description

S2SMT can be seen as a cascade of three major components: Automatic Speech Recognition (ASR), Machine Translation (MT) and Text-to-Speech Synthesis (TTS). One of the main problems of this multidisciplinary area, however, is the still weak integration between the three components. The main goal of this project is to improve speech translation systems for Portuguese by strengthening this integration. The project therefore encompasses three main tasks: the first addresses the ASR/MT interface (Task T1), the second addresses the MT/TTS interface (Task T2), and the third addresses the MT module itself (Task T3). A fourth task will build a proof-of-concept prototype (Task T4).

One of the main goals of the first task is the translation of spontaneous speech, for which the performance of the ASR component degrades seriously, with an even greater impact on the translation system. Tightening the integration between the two modules is another major goal. It can be approached in several ways, either by feeding the confusion networks produced by the ASR module to the MT module, or by investigating how the segmentation of the input speech affects its translation, since the phrase segmentation of the input is a critical issue in phrase-based translation. A third goal of this task is the use of a PDA interface for speech translation.

In the second task, one of the major problems to be addressed is that the current output of the MT module is poorly suited as input to the TTS module: it is often ungrammatical and, as a consequence, produces speech that is hard to understand. Another major subtask is voice conversion, which allows the synthesized speech to retain the characteristics of the original voice, making it very useful for a wide range of S2SMT applications.

The third task addresses major problems in statistical machine translation: the study of different methods to automatically build aligned parallel corpora from non-aligned ones, the updating of the translation model, and the use of fully supervised, semi-supervised and completely unsupervised approaches for adapting the system, using actual user results.

Finally, the fourth task aims to implement a proof-of-concept prototype. The goal of the project is not to build a complete speech-to-speech translation system, which involves many engineering issues, but rather to invest in the core research necessary to strengthen this area, thus providing an umbrella project for the PhD theses that will be devoted to this topic in the framework of the CMU-Portugal program.
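
As an illustration of the ASR/MT integration idea in Task T1, the following is a minimal Python sketch, not the project's implementation. It assumes a confusion network can be modelled as a list of slots, each holding alternative words with posterior probabilities, and it uses a toy word-for-word lexicon in place of a real MT model. The names `best_path`, `translate_confusion_network`, the lexicon and all the probabilities are hypothetical and chosen only for the example.

```python
from typing import Dict, List, Tuple

# Assumed representation: one slot per word position, each slot a list of
# (word, ASR posterior) alternatives. Real confusion networks also carry
# empty (epsilon) arcs, which this sketch leaves out.
Slot = List[Tuple[str, float]]
ConfusionNetwork = List[Slot]

# Toy lexicon (hypothetical): source word -> (target word, translation score).
Lexicon = Dict[str, Tuple[str, float]]


def best_path(cn: ConfusionNetwork) -> List[str]:
    """Return the ASR 1-best hypothesis: the top word of every slot."""
    return [max(slot, key=lambda wp: wp[1])[0] for slot in cn]


def translate_confusion_network(cn: ConfusionNetwork, lexicon: Lexicon) -> List[str]:
    """Translate slot by slot, scoring each alternative by
    ASR posterior * translation score instead of committing to the 1-best."""
    output = []
    for slot in cn:
        scored = [
            (posterior * lexicon[word][1], lexicon[word][0])
            for word, posterior in slot
            if word in lexicon
        ]
        if scored:
            # Keep the target word with the highest combined score.
            output.append(max(scored)[1])
        else:
            # No alternative is translatable: pass the top ASR word through.
            output.append(max(slot, key=lambda wp: wp[1])[0])
    return output


# Made-up example: the recognizer slightly prefers "grato" over "gato".
cn = [
    [("o", 0.9), ("ou", 0.1)],
    [("grato", 0.5), ("gato", 0.4), ("rato", 0.1)],
]
lexicon = {"o": ("the", 1.0), "gato": ("cat", 0.9), "rato": ("mouse", 0.9)}

print(best_path(cn))                             # ASR 1-best
print(translate_confusion_network(cn, lexicon))  # joint ASR + MT choice
```

In this toy case the 1-best path commits to a word the lexicon cannot translate, while the joint scoring recovers an alternative that the translation side supports. A phrase-based decoder would search over whole paths rather than deciding each slot independently.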

Demo

S2S Demo: https://www.youtube.com/watch?v=sP9uCBfSlIY