Concise Integer Linear Programming Formulations for Dependency Parsing

From HLT@INESC-ID

André Martins
André Martins
Andre Martins is a PhD student at the Language Technologies Institute within the School of Computer Science at Carnegie Mellon, and at Instituto de Telecomunicacoes at IST. He is being advised by Mario Figueiredo, Pedro Aguiar, Noah Smith and Eric Xing. His research interests are in kernel-based structured prediction applied to natural language processing.
Addresses: www mail

Date

  • 15:00, Friday, January 8th, 2010
  • Room 336

Speaker

  • André Martins, IST/Carnegie Mellon University

Abstract

We formulate the problem of non-projective dependency parsing as a polynomial-sized integer linear program. Our formulation is able to handle non-local output features in an efficient manner; not only is it compatible with prior knowledge encoded as hard constraints, it can also learn soft constraints from data. In particular, our model is able to learn correlations among neighboring arcs (siblings and grandparents), word valency, and tendencies toward nearly-projective parses.

The model parameters are learned in a max-margin framework by employing a linear programming relaxation. We evaluate the performance of our parser on data in several natural languages, achieving improvements over existing state-of-the-art methods.