Projects: Difference between revisions

From HLT@INESC-ID

Revision as of 11:35, 25 February 2013

Ongoing

International

DIRHA (2012-2014)
The DIRHA project addresses the development of voice-enabled automated home environments based on distant-speech interaction in different languages. A distributed microphone network is installed in the rooms of a house to selectively monitor acoustic and speech activities observable in any space, and eventually to run a spoken dialogue session with a given user in order to implement a service or to provide access to appliances and other devices. [1]
ICT COST Action IC1206 (2013-2017)
De-identification in multimedia content can be defined as the process of concealing the identities of individuals captured in a given set of data (images, video, audio, text), for the purpose of protecting their privacy.

This Action aims to facilitate coordinated interdisciplinary efforts (related to scientific, legal, ethical and societal aspects) in the introduction of person de-identification and reversible de-identification in multimedia content. [2]

euTV (2010-2012)
euTV plans to research, develop, and deploy best-in-class extractors for indexing and analysing individual information modalities. It also plans to perform research on multimodal feature fusion of individual extractors, exploiting structural characteristics of the multimedia streams and the content domains and descriptions of concepts modelled in ontologies. The framework will be based on service-oriented architectures, allowing easy updates of its components. [3]
META-NET (2010-2012)
META-NET brings together researchers, commercial technology providers, private and corporate language technology users, language professionals and other information society stakeholders. META will prepare the ambitious joint effort needed to further language technologies as a means of realising the vision of a Europe united as a single digital market and information space. [4]
LIREC (2008-2012)
LIREC aims to establish a multi-faceted theory of artificial long-term companions, embody this theory in robust and innovative technology and experimentally verify both the theory and technology in real social environments. LIREC will advance understanding of the concepts of embodiment, autobiographic memory and social interactions in the context of companions where the ‘mind’ might migrate to differently embodied ‘bodies’. [5]
COST-2102 (2006-2010)
The main objective of this COST Action is to develop an advanced acoustical, perceptual, and psychological analysis of verbal and nonverbal communication signals originating in spontaneous face-to-face interaction, in order to identify algorithms and automatic procedures capable of recognizing human emotional states. [6]
COST-2103 (2006-2010)
The main objective of the Action is to combine previously unexploited techniques with new theoretical developments to improve the assessment of voice for as many European languages as possible, while acquiring in parallel data with a view to elaborating better voice production models. [7]
I-Dash (2008-2010)
I-Dash is a project within the Safer Internet Plus programme. It focuses on the development of automatic tools to support police professionals in their investigations which involve large quantities of child abuse video material. [8]
VidiVideo (2007-2010)
The VidiVideo project takes on the challenge of creating substantially enhanced semantic access to video, implemented in a search engine. The engine will boost the performance of video search by forming a 1000-element thesaurus for detecting instances of audio, visual or mixed-media content. [9]
ECESS ()
ECESS is a European initiative to foster the European research area in the field of Language Technology. As a partner in this network, which connects major industry partners such as IBM, Siemens and Nokia with key European research institutes, L2F plays an active role in the R&D activities. We are responsible for the intelligibility and naturalness of the synthesized utterances while developing the acoustic module of the speech synthesis software. [10]

National

PoSTPort (2008-2010)
PoSTPort's goal is to port spoken language technologies originally developed for European Portuguese to the South-American and African varieties of Portuguese. The two main technologies to be investigated are speech synthesis and recognition. The project also involves corpora collection and characterization of the main differences between varieties, as well as their automatic identification, for switching among specific recognition systems.
REAP.PT (2009-2012)
REAP.PT aims to build a Portuguese version of the REAP tutoring system, developed at LTI to support language teaching for both native and foreign speakers through reading activities that focus students on learning vocabulary in context. In the first phase, a baseline version for European Portuguese (equivalent to the current English system) will be built. In the second phase, the project will concentrate on the open research challenges presented by the possibility of learning from current texts on topics of interest to the student.
PT-STAR (2009-2012)
Speech-to-Speech Machine Translation (S2SMT) technologies aim at enabling natural language communication between people that do not share the same language. S2SMT can be seen as a cascade of three major components: Automatic Speech Recognition, Machine Translation and Text-to-Speech Synthesis. One of the main problems of this multidisciplinary area, however, is the still weak integration between the three components. The main goal of PT-STAR is to improve speech translation systems for Portuguese by strengthening this integration.
ARIA (2010-2012)
The ARIA project aims to define a framework for assisted reading, targeted particularly at elderly communities. It envisages a broad and active perspective of reading, where annotation and group communication tasks complement and build upon the main reading activity, taking advantage of a digital proactive medium. It addresses elderly individuals through an assisted and adaptive approach that adapts document content, navigation, presentation, and interaction.
VITHEA (2010-2012)
The goal of the VITHEA project is the development of a software program for the treatment of aphasic patients, incorporating recent advances in speech and language technology. Using automatic speech recognition (ASR) technology, the program must be able to recognize what the patient said and to validate whether it was correct. The "virtual therapist" must be able to help the user whenever asked, both semantically and phonologically, that is, either as a written solution or as synthesized speech produced with text-to-speech (TTS) technology.
FalaComigo (2010-2013)
FalaComigo aims to develop a solution to "Enhance the Cultural Tourism through the Interaction with Virtual Characters", by providing a set of applications that will be deployed in various places of touristic interest. Through these solutions, the FalaComigo team will provide new and compelling ways of interacting with visitors, supplying a remarkable sensory experience. These solutions are built on a spoken dialogue system with speech recognition and synthesis, 3D facial animation, spoken dialogue management systems and question/answer technologies.
OOBIAN (2010-2012)
Construction of lexical and grammatical resources for local integration into the mechanism that recognizes syntactic-semantic relationships in text. Integration with an indexing system.
AVOZ (2012-2013)
The AVoz project proposes to conduct an in-depth study of automatic speech recognition for this type of speech, in order to improve ASR performance. Project Website: https://avoz.l2f.inesc-id.pt
SUSPECT (2012-2015)
The goal of this project is to develop privacy-preserving frameworks for processing voice data. Processing will be performed without having access to the voice, i.e., access to any form of the speech that can be analyzed to obtain information about the talker or what they said. Using a combination of tools from cryptography and secure multiparty computation, we will render voice processing algorithms secure, so that the privacy of all parties is preserved. We specifically propose to develop solutions for secure speaker verification and keyword spotting problems. We envision a mechanism whereby police/security agencies could obtain publicly accepted forms of legal sanction to look for pre-specified voices or phrases. The privacy-preserving framework will ensure that they are only notified when these occur, but will have no access to the voice data itself, thus preserving citizens’ privacy.

Finished

International

  • E-Circus - Education through Characters with Emotional Intelligence and Role-playing Capabilities that Understand Social Interaction (2006-2008)
  • COST 277 - Nonlinear Speech Processing (2001-2005)
  • COST 278 - Spoken Language Interaction in Telecommunication (2001-2005)
  • ALERT - Alert System for Selective Dissemination of Multimedia Information (2000-2002)
  • AUDIOLING-LP - Multimedia course for foreign students of the Portuguese language (Socrates Program)
  • SPEECHDAT - Speech Databases for Creation of Voice Driven Teleservices (1994-1999)
  • VODIS - Advanced Speech Technologies for Voice Operated Driver Information Systems (1995-1999)
  • SPRACH - Speech Recognition Algorithms for Connectionist Hybrids (1995-1998)
  • ELSNET - European Network in Language and Speech

National

  • FleetMod - FleetMod (simulate and predict the behavior of the skippers of fishing vessels to provide a framework to test the effectiveness of different management policies) (2008-2011)
  • StopFire - StopFire (a distributed intelligent system for forest fire combat aid) (2007-2011)
  • Tecnovoz - Tecnologia de Reconhecimento e Síntese de Voz (2006-2008)
  • RiCoBa - Rich Content Books for All (2005-2007)
  • LECTRA - Rich Transcription of Lectures for E-Learning Applications (2005-2007)
  • NLE GRID - Natural Language Engineering on a Computational Grid (2005-2007)
  • WFST - Weighted Finite State Transducers Applied to Spoken Language Processing (2004-2007)
  • DIGA - Dialog Interface for Global Access (2004-2007)
  • PAPOUS - The Story Teller (2003-2005)
  • IPSOM - Indexing, Integration and Sound Retrieval in Multimedia Documents (2000-2004)
  • ATA - Automatic Terms Acquisition (2001-2003)
  • FALA2 (2000-2003)
  • CITE-IV - Augmentative Communication Tools in Portuguese (1999-2000)
  • DIXI+ - A Text-to-Speech Synthesizer in Portuguese for Alternative and Augmentative Communication (1999-2001)
  • REC - Speech Recognition Applied to Telecommunications (1997-2000)
  • PRAXIS_FALA - Reconhecimento de fala de Alto Desempenho em Português (1997-1999)
  • CORAL - Labelled Spoken Dialogue Corpus (1997-1999)
  • BDFALA (in Portuguese) - Spoken Database for European Portuguese (1994-1998)
  • EDIFALA (in Portuguese) - Vocal Support System for Oral and Motor Handicapped (1993-1997)

Bilateral contracts

  • LIFAPOR - Spoken Books in European and Brazilian Portuguese (2005-2007)
  • ARARA - Automatic directory assistant service for Portuguese Telecom, together with Philips Speech Technology (1999-2001)
  • SVIT (in Portuguese) - Partial Automation of Directory Services Based on Synthesis of Telephone Numbers (1995-1999)

Finished before 1995

  • WERNICKE - A Neural Network Based, Speaker Independent, Large Vocabulary, Continuous Speech Recognition System (1992-1995)
  • RELATOR - A European Network of Repositories for Linguistic Resources (1993-1994)
  • SAM_A (ESPRIT III) - Multi-Lingual Speech Input/Output Assessment, Methodology and Standardization (1992-1993)
  • ONOMASTICA (Language Research Engineering) - Multi-Language Pronunciation Dictionary of Proper Names and Place Names (1993-1995)
  • SUNSTAR (ESPRIT II) - Integration and Design of Speech Understanding Interfaces (1989-1992)
  • HCM-ELSNET (Human Capital Mobility) - Phrase Level Phonology and Dialogue & Discourse (1994-1996)
  • EUREKA 151 - High quality speech coding at medium-to-low bit rates (1987-1990)
  • COST 229 - Applications of Digital Signal Processing to Telecommunications (1990-1993)
  • COMETT - A Trans-European Platform for Transferable Continuing Education in Digital Signal Processing (1990-1993)