Using Background Information to Improve Speech-to-Text Summarization
From HLT@INESC-ID
Latest revision as of 19:31, 26 February 2009
Date
- 15:00, Friday, February 27th, 2009
- Room 336
Speaker
Abstract
Speech-to-text summarization systems usually take as input the output of an automatic speech recognition (ASR) system, which is affected by problems such as recognition errors, disfluencies, and difficulty in accurately identifying sentence boundaries. We explore the use of automatically acquired background information to cope with these difficulties in summarizing spoken language.
Two strategies for using this additional information, both based on latent semantic analysis, are proposed and evaluated: topic-directed semantic spaces and mixed-source multi-document speech-to-text summarization.