Generation of Word Alternative Pronunciations Using Weighted Finite State Transducers (seminar)


Sérgio Paulo


  • July 01, 2005



This paper describes a speech segmentation tool that allows alternative word pronunciations within a weighted finite-state transducer (WFST) framework. Two approaches to generating word pronunciation graphs were developed and evaluated. The first approach is grapheme-based: each grapheme is converted into a WFST covering all the phones it can give rise to, and word graphs are obtained by concatenating the grapheme WFSTs. In the second approach, a training corpus is used to find the different realizations of each syllable. This information is used to generate alternative syllable-level pronunciations, represented as WFSTs, which are concatenated to produce the word graphs. Both approaches were evaluated by aligning the phone sequence generated by each approach with the manually labelled phone sequence for all utterances in the corpus. This alignment was used to compute F-rate values for each phone. The syllable-based approach produced the best results.
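The grapheme-based construction can be illustrated with a minimal sketch in plain Python. It is not the tool described in the seminar: the grapheme-to-phone table, the phone symbols, the dictionary-based transducer representation, and the function names (`grapheme_fst`, `concat`, `word_graph`, `pronunciations`) are all assumptions made for illustration, and every arc carries a weight of 0.0 because the abstract does not say how weights were assigned.

```python
from functools import reduce

# Hypothetical grapheme-to-phone table (toy data, not taken from the paper).
GRAPHEME_TO_PHONES = {
    "c": ["k", "s"],
    "a": ["a", "6"],
    "s": ["s", "z"],
}


def grapheme_fst(grapheme):
    """Two-state transducer with one arc per candidate phone.

    Arcs are (source, input label, output label, weight, destination).
    """
    arcs = [(0, grapheme, phone, 0.0, 1) for phone in GRAPHEME_TO_PHONES[grapheme]]
    return {"start": 0, "final": 1, "arcs": arcs}


def concat(left, right):
    """Concatenate two linear transducers.

    The start state of `right` is merged with the final state of `left`.
    Assumes each transducer has a single final state with the highest number.
    """
    shift = left["final"] - right["start"]
    shifted = [(s + shift, i, o, w, d + shift) for (s, i, o, w, d) in right["arcs"]]
    return {
        "start": left["start"],
        "final": right["final"] + shift,
        "arcs": left["arcs"] + shifted,
    }


def word_graph(word):
    """Build the pronunciation graph of a word from its grapheme transducers."""
    return reduce(concat, (grapheme_fst(g) for g in word))


def pronunciations(fst):
    """Enumerate every (phone sequence, total weight) path from start to final."""
    results = []

    def walk(state, phones, cost):
        if state == fst["final"]:
            results.append((tuple(phones), cost))
            return
        for (src, _ilabel, olabel, weight, dst) in fst["arcs"]:
            if src == state:
                walk(dst, phones + [olabel], cost + weight)

    walk(fst["start"], [], 0.0)
    return results


if __name__ == "__main__":
    graph = word_graph("casa")
    for phones, cost in pronunciations(graph):
        print(" ".join(phones), cost)
```

With two candidate phones per grapheme, the four-letter toy word yields a graph with five states and sixteen paths. The aligner would then select whichever path best matches the audio. A syllable-based variant would follow the same pattern, with syllable-level transducers learned from a training corpus in place of the per-grapheme ones.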

See Also