Revision as of 09:52, 5 January 2015 by Imt



DIRHA (2012-2015)
The DIRHA project addresses the development of voice-enabled automated home environments based on distant-speech interaction in different languages. A distributed microphone network is installed in the rooms of a house to selectively monitor acoustic and speech activities observable in any space, and eventually to run a spoken dialogue session with a given user in order to implement a service or to give access to appliances and other devices. [1]
SPEDIAL (2013-2015)
The main S&T goal of SpeDial is to devise machine-aided methods for spoken dialogue system enhancement and customization for call-center applications. [2]
RAGE (2015-2018)
The overall objective of RAGE is to develop, transform and enrich advanced technologies from the leisure games industry into self-contained gaming assets (i.e. solutions showing economic value potential) that support game studios in developing applied games. These assets will be made available, along with a large volume of high-quality knowledge resources, through a self-sustainable Ecosystem: a social space that connects research, gaming industries, intermediaries, education providers, policy makers and end-users.
COST Action IS1312 (2013-2017)
TextLink: Structuring Discourse in Multilingual Europe. TextLink will unify numerous but scattered linguistic resources on discourse structure. With its resources searchable by form and/or meaning and a source of valuable correspondences, TextLink will enhance the experience and performance of human translators, lexicographers, language technology and language learners alike. [3]
COST Action IC1206 (2013-2017)
De-identification in multimedia content can be defined as the process of concealing the identities of individuals captured in a given set of data (images, video, audio, text), for the purpose of protecting their privacy. This Action aims to facilitate coordinated interdisciplinary efforts (related to scientific, legal, ethical and societal aspects) in the introduction of person de-identification and reversible de-identification in multimedia content. [4]
COST Action IC1307 (2013-2017)
Combining Computer Vision and Language Processing For Advanced Search, Retrieval, Annotation and Description of Visual Data [5]


SUSPECT (2012-2015)
The goal of this project is to develop privacy-preserving frameworks for processing voice data. Processing will be performed without access to the voice, i.e., without access to any form of the speech that could be analyzed to obtain information about the talker or what they said. Using a combination of tools from cryptography and secure multiparty computation, we will render voice processing algorithms secure, so that the privacy of all parties is preserved. We specifically propose to develop solutions for the secure speaker verification and keyword spotting problems.
VOCE (2012-2015)
VOCE shall develop methods and algorithms that enable the online classification of stress from live speech, with the goal of providing feedback cues to speakers in real time to improve their communication skills. The work will focus on detecting and classifying stress in speech by leveraging advanced signal processing and machine learning techniques, complemented with psychological analysis of different aspects of stress perception. [6]
MISNIS (2013-2015)
The goal of this project is to develop an analysis framework that can help determine the effect of social networks on society. The intended framework will enable the identification and tracking of important events (topics) and of key actors within those topics, as well as the identification of their origin and propagation timeline. Measures and indicators to characterize events will be proposed by the research team and provided by the framework, contributing essential information for understanding the social network phenomenon and its importance in today's society.
IC4U (2013-2015)
In this project, we will apply advanced computational and modelling strategies to clinical data sets (including both instrumentation data and textual medical notes) in an effort to: 1) identify important factors or features that lead to unfavourable or favourable clinical conditions and that determine whether a patient should be released from the ICU with minimum risk of readmission; 2) design specific decision models that will support clinicians' decisions on releasing a patient from the ICU, in order to achieve more favourable clinical outcomes.
INSIDE (2014-2018)
INSIDE explores the scientific and technological challenges involved in developing symbiotic human-robot interactions in the context of a physical game involving children. To this end, INSIDE investigates how to extend multiagent planning under uncertainty to cooperative scenarios involving human and robot agents. INSIDE also investigates natural forms of interaction (including spoken interaction) and how contextual and environmental information (collected by a sensor network) can improve communication between robot and human agents during joint collaborative tasks involving activities in a physical environment. The outcomes of INSIDE will be deployed at the “Centro para o Desenvolvimento da Criança” (CDC, Child Development Center) of Hospital Garcia de Orta, in a physical game specially designed by the INSIDE team. [7]
TRATAHI (2015)
The goal of this exploratory project, in cooperation with Carnegie Mellon University, is to design a web-based tool for man-in-the-loop joint robust Automatic Speech Recognition (ASR) and Machine Translation (MT). For this purpose, research will be centered on representing the uncertainty of ASR and MT outputs as a lattice, incorporating user feedback into the lattice, and sharing information between MT and ASR lattices for improved performance. [8]
MT4M (2015)
This exploratory project, in cooperation with Carnegie Mellon University, aims to develop state-of-the-art technologies for Machine Translation in microblogs, such as Twitter and Facebook. Methods must therefore be developed to address the unpredictability and originality of text in this environment. The main tasks address normalization techniques to convert the input into more standard forms, the use of continuous space models that are more robust to word variations, and the development of MT evaluation techniques suited to this domain. [9]



  • euTV - Adaptive Channels in Europe (2010-2012)
  • METANET4U - Network of Excellence forging the multilingual Europe Technology Alliance (2010-2012)
  • LIREC - LIving with Robots and InteractivE Companions (2008-2012)
  • I-DASH - The Investigator’s Dashboard (2008-2010)
  • VIDIVIDEO - Interactive semantic video search with a large thesaurus of machine learned audio-visual concepts (2007-2010)
  • COST-2102 - Cross-Modal Analysis of Verbal and Non-verbal Communication (2006-2010)
  • COST-2103 - Advanced Voice Function Assessment (2006-2010)
  • E-Circus - Education through Characters with Emotional Intelligence and Role-playing Capabilities that Understand Social Interaction (2006-2008)
  • COST 277 - Nonlinear Speech Processing (2001-2005)
  • COST 278 - Spoken Language Interaction in Telecommunication (2001-2005)
  • ALERT - Alert System for Selective Dissemination of Multimedia Information (2000-2002)
  • AUDIOLING-LP - Multimedia course for foreign students of the Portuguese language (Socrates Program)
  • SPEECHDAT - Speech Databases for Creation of Voice Driven Teleservices (1994-1999)
  • VODIS - Advanced Speech Technologies for Voice Operated Driver Information Systems (1995-1999)
  • SPRACH - Speech Recognition ALgorithms for Connectionist Hybrids (1995-1998)
  • ELSNET - European Network in Language and Speech
  • ECESS - European Center of Excellence on Speech Synthesis


  • VITHEA - Virtual Therapist for Aphasia treatment (2010-2012)
  • AVOZ - Models for automatic speech recognition for the Elderly (2012-2013)
  • PoSTPort - POrting Speech Technologies to other varieties of Portuguese (2008-2010)
  • OBIAN - Lexical and grammatical resources for recognizing syntactic-semantic relationships on text (2010-2012)
  • FalaComigo - Enhance the Cultural Tourism through the Interaction with Virtual Characters (2010-2013)
  • REAP.PT - Computer Assisted Language Learning: Reading Practice (2009-2012)
  • PT-STAR - Speech Translation Advanced Research to and from Portuguese (2009-2012)
  • ARIA - Ambient-assisted Reading Interfaces for the Ageing-society (2010-2012)
  • FleetMod - Simulating and predicting the behavior of the skippers of fishing vessels to provide a framework to test the effectiveness of different management policies (2008-2011)
  • StopFire - A distributed intelligent system for forest fire combat aid (2007-2011)
  • Tecnovoz - Tecnologia de Reconhecimento e Síntese de Voz (2006-2008)
  • RiCoBa - Rich Content Books for All (2005-2007)
  • LECTRA - Rich Transcription of Lectures for E-Learning Applications (2005-2007)
  • NLE GRID - Natural Language Engineering on a Computational Grid (2005-2007)
  • WFST - Weighted Finite State Transducers Applied to Spoken Language Processing (2004-2007)
  • DIGA - Dialog Interface for Global Access (2004-2007)
  • PAPOUS - The Story Teller (2003-2005)
  • IPSOM - Indexing, Integration and Sound Retrieval in Multimedia Documents (2000-2004)
  • ATA - Automatic Terms Acquisition (2001-2003)
  • FALA2 (2000-2003)
  • CITE-IV - Augmentative Communication Tools in Portuguese (1999-2000)
  • DIXI+ - A Text-to-Speech Synthesizer in Portuguese for Alternative and Augmentative Communication (1999-2001)
  • REC - Speech Recognition Applied to Telecommunications (1997-2000)
  • PRAXIS_FALA - Reconhecimento de fala de Alto Desempenho em Português (1997-1999)
  • CORAL - Labelled Spoken Dialogue Corpus (1997-1999)
  • BDFALA (in Portuguese) - Spoken Database for European Portuguese (1994-1998)
  • EDIFALA (in Portuguese) - Vocal Support System for Oral and Motor Handicapped (1993-1997)

Bilateral contracts

  • LIFAPOR - Spoken Books in European and Brazilian Portuguese (2005-2007)
  • ARARA - Automatic directory assistant service for Portuguese Telecom, together with Philips Speech Technology (1999-2001)
  • SVIT (in Portuguese) - Partial Automation of Directory Services Based on Synthesis of Telephone Numbers (1995-1999)

Finished before 1995

  • WERNICKE - A Neural Network Based, Speaker Independent, Large Vocabulary, Continuous Speech Recognition System (1992-1995)
  • RELATOR - A European Network of Repositories for Linguistic Resources (1993-1994)
  • SAM_A (ESPRIT III) - Multi-Lingual Speech Input/Output Assessment, Methodology and Standardization (1992-1993)
  • ONOMASTICA (Language Research Engineering) - Multi-Language Pronunciation Dictionary of Proper Names and Place Names (1993-1995)
  • SUNSTAR (ESPRIT II) - Integration and Design of Speech Understanding Interfaces (1989-1992)
  • HCM-ELSNET (Human Capital Mobility) - Phrase Level Phonology and Dialogue & Discourse (1994-1996)
  • EUREKA 151 - High quality speech coding at medium-to-low bit rates (1987-1990)
  • COST 229 - Applications of Digital Signal Processing to Telecommunications (1990-1993)
  • COMETT - A Trans-European Platform for Transferable Continuing Education in Digital Signal Processing (1990-1993)