Phone Recognition and Language Modeling for Variety Identification

From HLT@INESC-ID

Latest revision as of 14:28, 9 April 2010

Oscar Koller
Oscar Koller was born in Berlin and is fascinated by spoken and written languages. He is currently working at INESC-ID’s Spoken Language Laboratory in Lisbon on his master's thesis, "Automatic Speech Recognition and Identification of African Portuguese", which will complete his Electrical Engineering studies at the Berlin University of Technology, Germany. His research interests are communication-centred: language variety recognition, languages without script, languages with pictographic script, and sign languages. As languages are better learnt abroad, he has spent significant time living in Brazil, Ghana, China and Portugal. He is fluent in German, English, Portuguese and French, has advanced knowledge of Mandarin, and knows the basics of American Sign Language.

Date

  • 14:00, Friday, February 19th, 2010
  • Room 336

Speaker

  • Oscar Koller, L2F

Abstract

This talk introduces the phonotactic approach "Phone Recognition and Language Modeling" (PRLM) for language and variety identification. After a detailed look at this token-based method, I will present the use of a specialized phone recognizer to distinguish African Portuguese from European Portuguese with high accuracy. In contrast to other PRLM-based methods, the tokenizer incorporates knowledge about the differences between the target varieties. This knowledge is introduced into an MLP phone recognizer by training the two varieties’ mono-phonemes as contrasting phoneme-like classes within a single tokenizer. Significant improvements in identification rate and computational cost were achieved compared with conventional single-tokenizer PRLM systems and with combinations of up to five parallel PRLM identifiers.
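
As a rough illustration of the language-modeling half of PRLM, the sketch below scores a phone sequence under one bigram model per variety and picks the best-scoring variety. It is not the system presented in the talk: the phone recognizer is assumed to have already produced the phone tokens, and the class name `BigramPhoneLM`, the function `identify`, the variety labels "A" and "B", the add-one smoothing and the toy phone strings are all assumptions made for this example.

```python
import math
from collections import Counter


def bigrams(phones):
    """Return adjacent phone pairs, padded with start/end markers."""
    padded = ["<s>", *phones, "</s>"]
    return list(zip(padded, padded[1:]))


class BigramPhoneLM:
    """Bigram model over phone tokens with add-one smoothing (an assumption)."""

    def __init__(self, sequences, vocab):
        # Possible next tokens: every phone plus the end marker.
        self.num_outcomes = len(set(vocab) | {"</s>"})
        self.pair_counts = Counter()
        self.context_counts = Counter()
        for phones in sequences:
            for prev, cur in bigrams(phones):
                self.pair_counts[(prev, cur)] += 1
                self.context_counts[prev] += 1

    def log_likelihood(self, phones):
        """Sum of smoothed log bigram probabilities for one phone sequence."""
        total = 0.0
        for prev, cur in bigrams(phones):
            num = self.pair_counts[(prev, cur)] + 1
            den = self.context_counts[prev] + self.num_outcomes
            total += math.log(num / den)
        return total


def identify(phones, models):
    """Pick the variety whose model gives the phone sequence the highest score."""
    return max(models, key=lambda name: models[name].log_likelihood(phones))


# Hypothetical tokenizer output for two made-up varieties "A" and "B".
vocab = ["p", "t", "a", "u"]
training = {
    "A": [["p", "a", "t", "a"], ["t", "a", "p", "a"]],
    "B": [["p", "u", "t", "u"], ["t", "u", "p", "u"]],
}
models = {name: BigramPhoneLM(seqs, vocab) for name, seqs in training.items()}

print(identify(["p", "a", "t", "a"], models))
```

In a real PRLM system the phone strings would come from a phone recognizer run on speech, and the models would be trained on far more data, typically with higher-order n-grams and more careful smoothing.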