Difference between revisions of "Speech-to-speech Translation"

From HLT@INESC-ID

Revision as of 14:29, 1 July 2008

People

  • Joana Paulo Pardal

Speech-to-speech machine translation is one of the most strategically relevant areas for L2F. The state of the art in speech translation depends crucially on the state of the art of several core technologies: speech recognition, machine translation and, to a lesser extent, text-to-speech synthesis (particularly voice morphing, which reproduces the source speaker's characteristics in the target voice). The main limitations of current machine translation systems are the lack of semantic interpretation and world knowledge, as well as insufficient coverage of the large proportion of idiosyncratic linguistic phenomena in the lexicon and syntax. The most promising approaches combine improved statistical methods with improved knowledge-driven methods in a variety of clever ways.

L2F has been investing in statistical speech-to-speech machine translation approaches based on weighted finite-state transducers (WFSTs) [Picó 2005] [Caseiro 2006], aiming at a tight integration between recognition and translation. WFSTs are especially well suited to combining different types of approaches, whether statistical or knowledge-based. The combination may be advantageous for achieving two different goals: (i) including morpho-syntactic linguistic knowledge in the statistical machine translation paradigm, and (ii) tackling the data sparseness problem in speech translation.
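The tight integration between recognition and translation rests on transducer composition: a recognition lattice (scored word hypotheses) is composed with a translation transducer, and the best joint path is extracted. The following is a minimal, self-contained sketch in the tropical semiring (epsilon transitions are omitted for brevity, and the lattice, vocabulary and scores are invented for illustration; this is not code from any L2F system):

```python
import heapq

# A toy weighted FST is a triple (start, finals, arcs), where arcs maps
# state -> list of (in_label, out_label, weight, next_state).
# Weights are negative log-probabilities (tropical semiring:
# path weight = sum of arc weights, best path = minimum).

def compose(fst1, fst2):
    """Compose T1 with T2: output labels of T1 must match input labels of T2."""
    s1, f1, a1 = fst1
    s2, f2, a2 = fst2
    start = (s1, s2)
    arcs, finals, stack, seen = {}, set(), [start], {start}
    while stack:
        q1, q2 = q = stack.pop()
        if q1 in f1 and q2 in f2:
            finals.add(q)
        for i1, o1, w1, n1 in a1.get(q1, []):
            for i2, o2, w2, n2 in a2.get(q2, []):
                if o1 == i2:                      # labels must agree
                    nxt = (n1, n2)
                    arcs.setdefault(q, []).append((i1, o2, w1 + w2, nxt))
                    if nxt not in seen:
                        seen.add(nxt)
                        stack.append(nxt)
    return start, finals, arcs

def best_path(fst):
    """Dijkstra search for the cheapest accepting path; returns its output."""
    start, finals, arcs = fst
    heap = [(0.0, 0, start, [])]   # (cost, tie-breaker, state, output so far)
    seen, tick = set(), 0
    while heap:
        cost, _, q, out = heapq.heappop(heap)
        if q in finals:
            return cost, out
        if q in seen:
            continue
        seen.add(q)
        for i, o, w, n in arcs.get(q, []):
            tick += 1
            heapq.heappush(heap, (cost + w, tick, n, out + [o]))
    return None

# Hypothetical recognition lattice over two competing Portuguese hypotheses.
lattice = (0, {2}, {
    0: [("boa", "boa", 0.2, 1), ("voa", "voa", 1.6, 1)],
    1: [("tarde", "tarde", 0.1, 2)],
})
# One-state word-for-word translation transducer (illustrative entries only).
translation = (0, {0}, {
    0: [("boa", "good", 0.0, 0), ("voa", "flies", 0.0, 0),
        ("tarde", "afternoon", 0.3, 0), ("tarde", "late", 0.9, 0)],
})
cost, words = best_path(compose(lattice, translation))
print(words)  # ['good', 'afternoon']
```

Because recognition scores and translation scores are added on the composed arcs, the search jointly optimises both models rather than translating a single best recognition hypothesis.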

This research is carried out within the scope of a national project on “Weighted Finite State Transducers Applied to Spoken Language Processing”. Two PhD theses have recently started in this area.

Related Resources

Related Software

  • Constrained Alignment Toolkit (CAT) - a word alignment toolkit produced in cooperation with the University of Pennsylvania. Please see the official web site: http://www.seas.upenn.edu/~strctlrn/CAT/CAT.html
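
As background for what a word alignment toolkit computes, here is a minimal IBM Model 1 sketch with EM training (a generic textbook algorithm with an invented toy corpus; it does not reproduce CAT's constrained alignment method):

```python
from collections import defaultdict

def ibm_model1(pairs, iterations=10):
    """Estimate word-translation probabilities t(f|e) with EM (IBM Model 1).
    `pairs` is a list of (source_words, target_words) sentence pairs."""
    # Uniform initialisation over all co-occurring word pairs.
    t = defaultdict(float)
    targets = {f for _, fs in pairs for f in fs}
    init = 1.0 / len(targets)
    for es, fs in pairs:
        for e in es:
            for f in fs:
                t[(f, e)] = init
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        for es, fs in pairs:
            for f in fs:
                norm = sum(t[(f, e)] for e in es)   # E-step: expected counts
                for e in es:
                    c = t[(f, e)] / norm
                    count[(f, e)] += c
                    total[e] += c
        for (f, e) in count:                         # M-step: re-normalise
            t[(f, e)] = count[(f, e)] / total[e]
    return t

def align(es, fs, t):
    """Viterbi alignment: link each target word to its best source word."""
    return [(max(es, key=lambda e: t[(f, e)]), f) for f in fs]

# Invented three-sentence English-Portuguese toy corpus.
corpus = [(["the", "house"], ["a", "casa"]),
          (["the", "book"], ["o", "livro"]),
          (["a", "house"], ["uma", "casa"])]
t = ibm_model1(corpus)
print(align(["the", "house"], ["a", "casa"], t))
```

In a constrained toolkit such as CAT, partial human annotations additionally restrict which links the model may posit; the sketch above shows only the unconstrained baseline.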

Demos

  • See the WFST demos page: http://www.l2f.inesc-id.pt/wiki/index.php/WFST_-_Weighted_Finite_State_Transducers_Applied_to_Spoken_Language_Processing#Demos

Finished Projects

  • WFST - Weighted Finite State Transducers Applied to Spoken Language Processing (2004-2007)

Selected Publications

  • João de Almeida Varelas Graça, Diamantino António Caseiro, Luísa Coheur, "The INESC-ID IWSLT07 SMT System", in Proceedings of the International Workshop on Spoken Language Translation (IWSLT), pages 125-130, October 2007
  • Diamantino António Caseiro, Isabel Trancoso, "Weighted Finite-State Transducer Inference for Limited-Domain Speech-to-Speech Translation", in Computational Processing of the Portuguese Language: 7th International Workshop, PROPOR 2006, Springer, pages 60-68, May 2006
  • D. Picó, J. González, F. Casacuberta, Diamantino António Caseiro, Isabel Trancoso, "Finite-state transducer inference for a speech-input Portuguese-to-English machine translation system", in Interspeech 2005, September 2005