PoSTPort (POrting Speech Technologies to other varieties of Portuguese)


Revision as of 10:43, 2 February 2010 by Imt

Sponsored by: FCT (PTDC/PLP/72404/2006)
Start: January 2008
End: December 2010


Open Research Position in project PoSTPort - Porting Speech Technologies to other Varieties of Portuguese (2009)


Project Leader: Isabel Trancoso

Undergraduate Students:


The goal of this project is to port spoken language technologies originally developed for European Portuguese to other varieties of Portuguese, namely those spoken in South American and African countries. The two main spoken language technologies to be investigated are text-to-speech and speech-to-text conversion, commonly known as synthesis and recognition. Instead of porting complete systems, we shall concentrate on the linguistically relevant modules that are likely to be most affected. The main task of this proposal is therefore structured in terms of modules, which will be ported and evaluated as independently as possible.

Prior to this main work, however, the project will involve two tasks. In the first, we shall build the pilot corpus necessary for studying the main linguistic differences and for porting to the new varieties. In the second, we shall characterize the main differences between the studied varieties, starting with the orthographic and syntactic levels and ending with the phonetic and phonological levels. As an outcome of this task, we shall attempt to define a phonetic alphabet that encompasses all varieties of Portuguese.

The fourth task concerns the automatic identification of spoken varieties of Portuguese. This is an important goal in itself and is also relevant as a pre-processing stage for switching among recognition systems developed for specific varieties. We plan to implement this identification module by exploring different types of cues, such as phonotactic, acoustic and prosodic ones.

The four tasks of the proposal combine linguistic, signal processing and natural language processing knowledge, thus reflecting the importance of an interdisciplinary team.


  • T1 - Pilot Corpus collection
  • T2 - Characterization of main differences
  • T3 - Porting spoken language modules
  • T4 - Automatic identification of spoken varieties
  • T5 - Management

== Publications ==


Jean-Luc Rouas, Isabel Trancoso, Céu Viana, Mónica Abreu, Language and variety verification on broadcast news for Portuguese, Speech Communication, Elsevier, vol. 50, n. 11, November 2008

Amália Mendes, A. Estrela (accepted for publication), Constructions with SE in African varieties of Portuguese, Phrasis, Ghent, Academia Press, pp.

Invited talks:

Isabel Trancoso, A Broadcast News Processing Chain for Several Varieties of Portuguese, In XXVII Simpósio Brasileiro de Telecomunicações, Brazilian Telecommunications Society, Blumenau, Brazil, September 2009

International conferences:

Alberto Abad, Isabel Trancoso, The L2F language verification systems for Albayzin-08 evaluation, In V Jornadas en Tecnología del Habla, November 2008

Patrick Silva, Nelson Neto, Aldebaro Klautau, André Adami, Isabel Trancoso, Speech Recognition for Brazilian Portuguese using the Spoltech and OGI-22 Corpora, In XXVI Simpósio Brasileiro de Telecomunicações, Rio de Janeiro, Brazil, September 2008

Jean-Luc Rouas, Isabel Trancoso, Céu Viana, Mónica Abreu, Portuguese Variety Identification on Broadcast News, In ICASSP 2008 - Int. Conf. on Acoustics, Speech, and Signal Processing, IEEE, Las Vegas, USA, April 2008

Fernando Batista, Isabel Trancoso, Nuno Mamede, Automatic Punctuation and Capitalization for Iberian Languages, In I Joint SIG-IL/Microsoft Workshop on Speech and Language Technologies for Iberian Languages, Porto Salvo, Portugal, September 2009 (Best Student Paper Award)

Plínio Barbosa, Céu Viana, Isabel Trancoso, Cross-variety Rhythm Typology in Portuguese, In Interspeech 2009, Brighton, UK, September 2009

Helena Moniz, Isabel Trancoso, A.I. Mata da Silva, Classification of disfluent phenomena as fluent communicative devices in specific prosodic contexts, In Interspeech 2009, ISCA, pages 1719-1722, Brighton, UK, September 2009

Alberto Abad, Isabel Trancoso, Nelson Neto, Céu Viana, Porting an European Portuguese Broadcast News Recognition System to Brazilian Portuguese, In Interspeech 2009, ISCA, Brighton, UK, September 2009

Fernando Batista, Isabel Trancoso, Nuno Mamede, Comparing Automatic Rich Transcription for Portuguese, Spanish and English Broadcast News, In ASRU - IEEE Automatic Speech Recognition and Understanding Workshop, Merano, Italy, December 2009


This demo shows our current Broadcast News recognizer for Brazilian Portuguese.