A Lightweight on-the-fly Capitalization System for Automatic Speech Recognition

From HLT@INESC-ID

Date

  • 15:00, September 21, 2007
  • 3rd floor meeting room

Speaker

  • Fernando Batista
Abstract

This presentation describes a method for capitalizing speech transcriptions. Several resources were used, including a lexicon, written newspaper corpora, and speech transcriptions. Both generative and discriminative approaches were tested: finite-state transducers built automatically from language models, and maximum entropy models. Evaluation results are presented for written newspaper corpora and for speech transcriptions of broadcast news corpora.
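
To make the task concrete, here is a minimal sketch of a lexicon-based capitalization baseline. It is not the system described in the talk: it swaps in a simple most-frequent-form lookup in place of the finite-state transducers and maximum entropy models mentioned above. The function names `build_lexicon` and `capitalize` and the toy corpus are hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict


def build_lexicon(sentences):
    """Map each lowercased word to its most frequent written form.

    Hypothetical helper: assumes whitespace-tokenized written text.
    """
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        # Skip the first token: a sentence-initial capital says little
        # about how the word itself is normally written.
        for token in tokens[1:]:
            counts[token.lower()][token] += 1
    return {word: forms.most_common(1)[0][0] for word, forms in counts.items()}


def capitalize(transcript, lexicon):
    """Restore capitalization word by word; unknown words stay lowercase."""
    return " ".join(lexicon.get(tok, tok) for tok in transcript.lower().split())


# Toy written corpus (illustrative data, not from the talk).
corpus = [
    "The minister arrived in Lisbon on Monday",
    "Yesterday the president of Portugal visited Lisbon",
    "A new bridge opened in Lisbon near the river",
]
lexicon = build_lexicon(corpus)
print(capitalize("the president visited lisbon and portugal", lexicon))
```

A lookup like this ignores context, so it cannot separate words that are capitalized only in some uses. Context-sensitive models such as those named in the abstract are one way to address that.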