Using Background Information to Improve Speech-to-Text Summarization

From HLT@INESC-ID

Revision as of 19:31, 26 February 2009 by Rdmr (talk | contribs)
Ricardo Daniel Ribeiro

Date

  • 15:00, Friday, February 27th, 2009
  • Room 336

Speaker

  • Ricardo Daniel Ribeiro

Abstract

Speech-to-text summarization systems usually take as input the output of an automatic speech recognition (ASR) system, which is affected by recognition errors, disfluencies, and difficulties in accurately identifying sentence boundaries. We explore the use of automatically acquired background information to cope with these difficulties when summarizing spoken language.

Two strategies for using this additional information, both based on latent semantic analysis, are proposed and evaluated: topic-directed semantic spaces and mixed-source multi-document speech-to-text summarization.
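
For readers unfamiliar with latent semantic analysis, the following is a minimal sketch of the general idea of scoring transcript sentences in a space that also contains background text. It is not the method presented in the talk. To stay dependency-free it keeps only the dominant latent dimension, found by power iteration, where a real system would use a truncated SVD with several dimensions. The function names (`tokenize`, `dominant_topic_weights`, `summarize`) and the example sentences are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


def dominant_topic_weights(sentences, iterations=100):
    """Weight of each sentence on the dominant latent dimension.

    Conceptually builds a term-by-sentence count matrix A and runs power
    iteration on A^T A, which converges to the first right singular
    vector of A (a rank-1 simplification of LSA).
    """
    n = len(sentences)
    if n == 0:
        return []
    counts = [Counter(tokenize(s)) for s in sentences]
    # gram[i][j] = (A^T A)[i][j]: dot product of the term counts of i and j.
    gram = [
        [sum(counts[i][t] * counts[j][t] for t in counts[i]) for j in range(n)]
        for i in range(n)
    ]
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        w = [sum(gram[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # all sentences empty: keep the uniform vector
            break
        v = [x / norm for x in w]
    return v


def summarize(transcript, background, size=1):
    """Select `size` transcript sentences, kept in their original order.

    Background sentences help shape the latent space but are never
    selected themselves (an assumption of this sketch).
    """
    weights = dominant_topic_weights(transcript + background)
    ranked = sorted(range(len(transcript)), key=lambda i: weights[i], reverse=True)
    return [transcript[i] for i in sorted(ranked[:size])]


# Hypothetical ASR output (with a recognition error and a disfluency)
# and hypothetical background text on the same topic.
transcript = [
    "the government announced a knew budget for schools",
    "uh well the the weather was nice",
    "schools will receive more funding under the budget",
]
background = [
    "The government approved the budget for public schools.",
    "The budget increases funding for schools and teachers.",
]
print(summarize(transcript, background, size=2))
```

Because the background sentences share vocabulary with the on-topic transcript sentences, they pull the dominant dimension toward that topic, so an off-topic or disfluent sentence tends to receive a lower weight.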