Speaker and content identification
From HLT@INESC-ID
Date
- 15:00, Friday, April 13th, 2012
- Room 336
Speaker
- Xavier Anguera, Telefonica Research
Abstract
In this talk I will cover two topics I have recently been working on. First, regarding speaker identification, I will introduce the use of binary fingerprints to model a speaker's voice. Standard acoustic vectors are projected onto a special GMM that represents the speaker acoustic space, yielding high-dimensional binary vectors that have proven successful at identifying speakers in speaker verification and diarization tasks. Second, I will talk about current developments in pattern-matching approaches that enable content-centric applications when little or no training data is available for a particular language. In particular, I will describe a query-by-example system I presented at the MediaEval 2011 evaluation, which uses a novel feature extraction front-end that I described further in a paper at ICASSP 2012.
Note: This seminar will be held in English.