Hierarchical Phrase-based Translation with Weighted Finite-State Transducers
From HLT@INESC-ID
Adrià de Gispert
In 2007 he took a research position at the University of Cambridge, where he has worked on the AGILE project of the DARPA-funded GALE program, developing several SMT systems (n-gram-based, phrase-based, hierarchical phrase-based) and participating successfully in multiple international evaluation campaigns across several language pairs and conditions (IWSLT 2004-2006 tasks, ACL-WMT 2005-2010 shared tasks, NIST 2006-2009 evaluations), including the translation of automatic speech recognition output. His current research interests include large-scale statistical translation systems with little pruning, grammar induction, morphology-aware models, distributed models, reordering models, and MT system combination.
Date
- 14:10, Tuesday, July 20th, 2010
- Room FA1 - Pavilhão de Informática I (IST)
Speaker
- Adrià de Gispert, Department of Engineering, University of Cambridge
Abstract
In this talk I will describe HiFST, the statistical machine translation system of the Cambridge University Engineering Department. I will review hierarchical phrase-based translation and explain why an implementation based on Weighted Finite-State Transducers is a convenient way to avoid search errors in decoding.
I will then give an overview of the current research lines of our SMT group, which focus on defining appropriate translation grammars, exploiting the vast number of alternative hypotheses encoded in translation lattices via rescoring with large-scale language models, and decoding under Minimum Bayes Risk for system combination.