Biology Based Alignments of Paraphrases for Sentence Compression


Gaël Harry Dias
João Paulo Cordeiro
João Paulo Cordeiro

João Cordeiro graduated from the University of Beira Interior in Applied Mathematics in 1998. He holds a Master's degree in Artificial Intelligence and Computer Science, in Text Mining and Extraction, obtained in 2003 from the Faculty of Science of the University of Porto (Portugal). Since January 2005 he has been a PhD student in Computer Science, in the specific area of Natural Language Processing, at the University of Beira Interior (Portugal).
He is an Assistant Lecturer in the Department of Computer Science of the University of Beira Interior, where he teaches subjects related to Programming, Algorithms and Artificial Intelligence.

He is a researcher at the Centre of Human Language Technology and Bioinformatics (HULTIG) of the University of Beira Interior and also a researcher at the NIAAD in Porto (Portugal).

He is a member of the ACL and a member of the Portuguese Association of Artificial Intelligence (APPIA).



  • 11:00, Friday, June 22, 2007
  • 3rd floor meeting room



In this communication, we present a study on extracting and aligning paraphrases in the context of Sentence Compression. First, we justify the application of a new measure for the automatic extraction of paraphrase corpora. Second, we discuss the work of Barzilay & Lee (2003), who use clustering of paraphrases to induce rewriting rules. We will see, through classical visualization methodologies (Kruskal & Wish, 1977) and exhaustive experiments, that clustering may not be the best approach for automatic pattern identification. Finally, we will provide some results of different biology-based methodologies for pairwise paraphrase alignment.
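
To give a concrete sense of what a biology-based pairwise alignment can look like, the sketch below applies Needleman-Wunsch global alignment, a standard algorithm for aligning biological sequences, to two sentences treated as sequences of words. The abstract does not say which alignment algorithms the talk covers, so the choice of Needleman-Wunsch, the function name `align_words`, and the scoring values (`match`, `mismatch`, `gap`) are all assumptions made purely for illustration.

```python
def align_words(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align two word lists (Needleman-Wunsch style).

    Illustrative sketch only: the scoring values are assumed, not taken
    from the talk. Returns (score, pairs), where each pair is
    (word_from_a, word_from_b) and None marks a gap.
    """
    n, m = len(a), len(b)

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two words
                score[i - 1][j] + gap,            # word of a facing a gap
                score[i][j - 1] + gap,            # word of b facing a gap
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            pairs.append((a[i - 1], b[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            pairs.append((a[i - 1], None))
            i -= 1
        else:
            pairs.append((None, b[j - 1]))
            j -= 1
    pairs.reverse()
    return score[n][m], pairs


if __name__ == "__main__":
    long_form = "the cat sat on the mat".split()
    short_form = "the cat sat on mat".split()
    total, alignment = align_words(long_form, short_form)
    print(total)
    for left, right in alignment:
        print(left or "_", right or "_")
```

In a sentence compression setting, words of the longer sentence that end up facing a gap are natural candidates for deletion. A local alignment variant such as Smith-Waterman would follow the same dynamic-programming pattern but allow the alignment to start and end anywhere in either sentence.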