Extractive Summarization of Broadcast News: Comparing Strategies for European Portuguese

From HLT@INESC-ID

Ricardo Daniel Ribeiro

Date

  • 15:00, September 14, 2007
  • 3rd floor meeting room

Speaker

  • Ricardo Daniel Ribeiro

Abstract

This work compares three methods for extractive summarization of European Portuguese broadcast news: feature-based, Maximal Marginal Relevance (MMR), and Latent Semantic Analysis (LSA). The main goal is to understand how much the automatic summaries agree with one another and how they compare with summaries produced by non-professional human summarizers. Results were evaluated with the ROUGE-L metric. MMR performed close to the human summarizers. The feature-based and LSA summarizers performed close to each other and, measured against the human summaries, worse than MMR.
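
For readers unfamiliar with two of the techniques named in the abstract, the following is a minimal sketch of greedy Maximal Marginal Relevance selection and of an LCS-based ROUGE-L score. It is not the system presented in the talk. Everything in it is an assumption made for illustration: whitespace tokenization, bag-of-words cosine similarity, the whole document used as the query, the trade-off parameter lam, a sentence-level ROUGE-L with precision and recall weighted equally, and the function names themselves.

```python
import math
from collections import Counter


def bow(text):
    """Lowercased bag-of-words vector (whitespace tokenization is an assumption)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two Counter vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(sentences, k, lam=0.7):
    """Greedy MMR: choose up to k sentence indices, trading relevance for novelty."""
    vectors = [bow(s) for s in sentences]
    doc = bow(" ".join(sentences))  # assumption: the full document is the query
    selected = []
    candidates = list(range(len(sentences)))

    def score(i):
        relevance = cosine(vectors[i], doc)
        # Redundancy is the highest similarity to any already selected sentence.
        redundancy = max((cosine(vectors[i], vectors[j]) for j in selected), default=0.0)
        return lam * relevance - (1 - lam) * redundancy

    while candidates and len(selected) < k:
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return sorted(selected)  # return indices in document order


def lcs_length(x, y):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(y) + 1)
    for xi in x:
        cur = [0]
        for j, yj in enumerate(y, start=1):
            cur.append(prev[j - 1] + 1 if xi == yj else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """Simplified sentence-level ROUGE-L with equal weight on precision and recall."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(cand), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    news = [
        "the government approved the budget",
        "the government approved the budget today",
        "rain is expected in lisbon",
    ]
    picked = mmr_select(news, k=2, lam=0.5)
    summary = " ".join(news[i] for i in picked)
    print(picked, rouge_l_f1(summary, "the government approved the budget"))
```

Lowering lam penalizes sentences that repeat what has already been selected, while lam = 1 reduces the procedure to plain relevance ranking. A real evaluation would use the official ROUGE toolkit rather than this simplified score.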