Detecting mis-recognitions in ASR output


Thomas Pellegrini


  • 15:00, Friday, October 8th, 2010
  • Room 336



Detecting incorrect words in automatic transcriptions is useful for many applications, such as marking or discarding low-confidence words in automatic news subtitles or transcriptions, or selecting unsupervised material to train acoustic models. In this talk, I will report experiments that compared several statistical classifiers: a baseline Maximum Entropy approach, Conditional Random Fields, and a Markov Chain approach. New features drawn from knowledge sources other than the decoder itself were explored: a binary feature that compares the outputs of two different ASR systems word by word; a feature based on the number of hits for the hypothesized bigrams, obtained by querying a very popular Web search engine; and a feature related to automatically inferred topics at the sentence and word levels. The classification error rate was reduced from 13.9% to 12.1%. Experiments were conducted on a European Portuguese and an American English broadcast news corpus.
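
As a rough illustration of the binary cross-system feature and of the classification error rate mentioned above, here is a minimal Python sketch. It is not the implementation used in the experiments: the function names (agreement_feature, classification_error_rate) are hypothetical, and aligning the two hypotheses with the standard-library difflib.SequenceMatcher is an assumption, since the abstract does not say how the word-by-word comparison was aligned.

```python
from difflib import SequenceMatcher


def agreement_feature(primary, secondary):
    """Return one 0/1 flag per word of `primary`.

    A flag is 1 when the word falls inside a block that also appears,
    in order, in `secondary` (the second ASR system's hypothesis).
    Assumption: difflib alignment stands in for whatever alignment
    the original experiments used.
    """
    flags = [0] * len(primary)
    matcher = SequenceMatcher(a=primary, b=secondary, autojunk=False)
    for block in matcher.get_matching_blocks():
        for i in range(block.a, block.a + block.size):
            flags[i] = 1
    return flags


def classification_error_rate(predicted, reference):
    """Fraction of words whose predicted label (correct/incorrect) is wrong."""
    if len(predicted) != len(reference):
        raise ValueError("label sequences must have the same length")
    if not reference:
        raise ValueError("label sequences must not be empty")
    errors = sum(p != r for p, r in zip(predicted, reference))
    return errors / len(reference)


# Toy hypotheses from two hypothetical ASR systems.
system_a = "the cat sat on the mat".split()
system_b = "the cat sat in the mat".split()
flags = agreement_feature(system_a, system_b)
```

In this toy case the only word of system_a that system_b does not confirm is "on", so it is the only word flagged 0. A classifier such as the ones compared in the talk could take this flag as one input among the decoder-based features.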

Note: This seminar will be held in English, if required.