Detecting mis-recognitions in ASR output

Thomas Pellegrini

Date

15:00, Friday, October 8th, 2010
Room 336

Speaker

Thomas Pellegrini

Abstract

Detecting incorrect words in automatic transcriptions is useful for many applications, such as marking or discarding low-confidence words in automatic news subtitles or transcriptions, or selecting unsupervised material to train acoustic models. In this talk, I will report on experiments comparing several statistical classifiers: a baseline Maximum Entropy approach, Conditional Random Fields, and a Markov Chain approach. New features drawn from knowledge sources other than the decoder itself were explored: a binary feature that compares the outputs of two different ASR systems word by word; a feature based on the number of hits for the hypothesized bigrams, obtained by querying a very popular Web search engine; and a feature related to automatically inferred topics at the sentence and word levels. The classification error rate improved from 13.9% to 12.1%. Experiments were conducted on European Portuguese and American English broadcast news corpora.

Note: This seminar will be held in English, if required.

From HLT@INESC-ID
