Recover Capitalization and Punctuation Marks on Speech Transcriptions


Revision as of 14:38, 19 May 2011 by Acbm

Fernando Batista


  • 14:30, Wednesday, May 25th, 2011
  • Room 20



This presentation addresses two important metadata annotation tasks involved in the production of rich transcripts: capitalization and the recovery of punctuation marks. The study focuses on broadcast news, using both manual and automatic speech transcripts. Different capitalization models were analysed and compared; the results indicate that generative approaches capture the structure of written corpora better, while discriminative approaches are suitable for speech transcripts and are also more robust to ASR errors. Language dynamics were also addressed, and the results indicate that capitalization performance is affected by the temporal distance between the training and testing data. For the punctuation task, the study covers the three most frequent marks: full stop, comma, and question mark. Early experiments addressed full-stop and comma recovery using local features and combining lexical and acoustic information. Recent experiments also incorporate prosodic information and extend the study to question marks.

Note: This seminar will be held in English.