Word alignment based on Maximum Matching Method

From HLT@INESC-ID

Liang Tian
I received my B.Sc. degree in computer science and technology from the Harbin Institute of Technology, Heilongjiang, China, in June 2009. Since October 2009, I have been with the Natural Language Processing & Portuguese-Chinese Machine Translation (NLP2CT) laboratory at the University of Macau, where I am working toward the Ph.D. degree in software engineering (September 2013). My main research interests focus on statistical machine translation.

Date

  • 15:00, Monday, July 22nd, 2013
  • Room 020, INESC-ID

Speaker

  • Liang Tian, NLP2CT, University of Macau

Abstract

GIZA++, a toolkit for obtaining word alignments, has been the most widely used toolkit in the statistical machine translation research field. To improve the quality of word alignment and the lexical translation performance, a word alignment approach based on the Maximum Matching Method (MMM) and similarity is proposed. The algorithm described in this presentation first detects the words and phrases of parallel English and Chinese sentences using MMM, and then obtains candidate alignments under the constraints of both a dictionary and the statistical GIZA++ results. An empirical study demonstrates that the proposed method gives better alignment results than GIZA++. Recently, an online system based on this word alignment method and the Moses decoder has been developed; it is demonstrated at http://nlp2ct.sftw.umac.mo/MT/.
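
For readers unfamiliar with maximum matching, the following is a minimal sketch of the forward (left-to-right) variant applied to an unsegmented Chinese string. It is only an illustration of the general technique, not the speaker's implementation: the abstract does not say which variant is used, and the function name, the `max_len` parameter, and the toy lexicon are all assumptions made for this example.

```python
def forward_maximum_matching(text, lexicon, max_len=4):
    """Greedy left-to-right segmentation.

    At each position, take the longest substring (up to max_len characters)
    that appears in the lexicon; fall back to a single character otherwise.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until something matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted so the loop makes progress.
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical toy lexicon, for illustration only.
lexicon = {"机器", "翻译", "机器翻译", "研究"}
print(forward_maximum_matching("机器翻译研究", lexicon))
# → ['机器翻译', '研究']
```

Because the longest entry wins, "机器翻译" is kept as one unit rather than being split into "机器" and "翻译". In the approach described above, units detected this way would then be filtered by a dictionary and by GIZA++ results, a step this sketch does not attempt to reproduce.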