A Lightweight on-the-fly Capitalization System for Automatic Speech Recognition
From HLT@INESC-ID
Date
- 15:00, September 21, 2007
- 3rd floor meeting room
Speaker
Abstract
This presentation describes a method for capitalizing speech transcriptions. Several resources were used, including a lexicon, written newspaper corpora, and speech transcriptions. Two kinds of approach were tested: a generative one, using finite-state transducers built automatically from language models, and a discriminative one, using maximum entropy models. Evaluation results are presented for both written newspaper corpora and speech transcriptions of broadcast news.