Speaker Characterization with MLSFs (seminar)

From HLT@INESC-ID

Revision as of 19:42, 8 June 2006 by Lcoheur (talk | contribs)

Date

  • June 16, 2006

Speaker

Abstract

The work described in this paper concerns the analysis of an alternative feature for speaker characterization in the context of speaker recognition: Line Spectrum Frequencies (LSFs) derived from mel-filter bank energies. This new feature, which we call mel-LSFs (MLSFs), performs similarly to MFCCs, one of the most common features in speaker recognition, for male speakers, and better than MFCCs for female speakers. When combined with mel-LSF differences, the MLSF feature outperforms MFCCs for both male and female speakers, even when the temporal deltas (ΔMFCCs) are included. Performance is measured in the context of speaker verification, using the equal error rate (EER) and the minimum half total error rate (HTER). Detection error tradeoff (DET) curves are also presented, as well as HTER curves. The main objective of this study is to compare the performance of different features within a common framework, for which a standard support vector machine recogniser was developed. Tests are based on the cellular component of the “2002 NIST Speaker Recognition Evaluation Corpus”.
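
For readers unfamiliar with the two summary metrics named in the abstract, the following is a minimal sketch of how EER and minimum HTER can be computed from verification scores. It is not the authors' evaluation code: the function names, the "accept when score >= threshold" convention, and the example scores are all assumptions made for illustration. HTER is taken here as the mean of the false acceptance rate (FAR) and the false rejection rate (FRR), and the EER is approximated at the candidate threshold where the two rates are closest.

```python
import numpy as np

def error_rates(target_scores, impostor_scores):
    """Return candidate thresholds with the FAR and FRR at each one.

    Assumption: a trial is accepted when its score >= threshold.
    """
    target = np.asarray(target_scores, dtype=float)
    impostor = np.asarray(impostor_scores, dtype=float)
    # Every observed score is used as a candidate threshold.
    thresholds = np.unique(np.concatenate([target, impostor]))
    # FAR: fraction of impostor trials wrongly accepted.
    far = np.array([(impostor >= t).mean() for t in thresholds])
    # FRR: fraction of target trials wrongly rejected.
    frr = np.array([(target < t).mean() for t in thresholds])
    return thresholds, far, frr

def eer_and_min_hter(target_scores, impostor_scores):
    """Approximate the EER and compute the minimum HTER over thresholds."""
    _, far, frr = error_rates(target_scores, impostor_scores)
    hter = (far + frr) / 2.0
    # EER: operating point where FAR and FRR are closest to equal.
    i = np.argmin(np.abs(far - frr))
    eer = (far[i] + frr[i]) / 2.0
    return eer, hter.min()

# Hypothetical scores, invented for illustration only (not from the paper).
example_targets = [0.9, 0.8, 0.7, 0.4]
example_impostors = [0.6, 0.3, 0.2, 0.1]
example_eer, example_min_hter = eer_and_min_hter(example_targets, example_impostors)
```

Plotting FAR against FRR over all thresholds, usually on normal-deviate axes, gives the DET curve; plotting HTER against the threshold gives an HTER curve.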