DIGA - Dialog Interface for Global Access

Goal: to develop a conversational interface that allows users to obtain information stored in a database over the telephone, using speech.

Sponsored by: FCT (POSI/PLP/14319/2001)
Start: January 2004
Duration: 3 years

Team (when the project started)

Project Leader: Nuno Mamede

Graduate Students:

Luísa Coheur (Ph.D. completed December 2004)
David Matos (Ph.D. completed July 2005)
Porfírio Filipe (Ph.D.)
Sérgio Paulo (Ph.D.)
Márcio Mourão (M.Sc. completed November 2005)
João Graça (M.Sc.)

Other:

Summary

The primary goal of this project is integration. It is intended as a vehicle for bringing together, for the first time, research contributions from all the members of the recently created Spoken Language Systems Lab (L2F - Laboratório de sistemas de Língua Falada) of INESC-ID. It will therefore integrate teams with very different expertise (speech processing, neural networks and natural language processing), who will join efforts to develop a conversational interface for accessing and retrieving online information.

From an international point of view, the project objectives may not seem too ambitious, given the much more advanced state of the art in language engineering for other languages. For European Portuguese, however, they represent a very significant research effort. Building a prototype of a spoken dialog interface using state-of-the-art core language technologies is therefore the first step towards addressing, in the future, innovative research areas such as multilingual information access and animated multimodal conversational agents.

The main tasks in developing research testbeds for communicative interaction include developing medium and large vocabulary continuous speech recognition modules, speech synthesis modules, natural language understanding and generation modules, and dialog managers.

Although the aim is a general architecture that minimises domain dependency, the system will have to be trained and tested in a particular domain. We shall start by specifying the functionality of a relatively simple demonstrator, based on a mixed-initiative dialog approach in a very limited domain such as, for instance, meteorology. At a later stage, the portability of the approach will be tested in a more complex domain such as travelling and scheduling. The final task of the project will be the evaluation of the two demonstrators.
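
As a rough illustration of what "mixed-initiative" means in a limited domain such as meteorology, the sketch below uses a simple slot-filling frame: the user may volunteer any piece of information in any order, and the system asks only for what is still missing. This is not the project's implementation. The names `WeatherFrame`, `understand`, `next_system_turn` and the toy `FORECASTS` table are all hypothetical, and keyword spotting stands in for real query understanding.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical toy "database"; a real system would query an actual data store.
FORECASTS = {
    ("lisboa", "today"): "sunny, 24 C",
    ("lisboa", "tomorrow"): "cloudy, 21 C",
    ("porto", "today"): "rain, 18 C",
}
CITIES = {"lisboa", "porto"}
DAYS = {"today", "tomorrow"}


@dataclass
class WeatherFrame:
    """Dialog state: the slots needed before the database can be queried."""
    city: Optional[str] = None
    day: Optional[str] = None


def understand(utterance: str, frame: WeatherFrame) -> None:
    """Fill whichever slots the user mentions (user initiative).

    Keyword spotting is an assumption made for brevity only.
    """
    cleaned = utterance.lower().replace("?", " ").replace(",", " ")
    for word in cleaned.split():
        if word in CITIES:
            frame.city = word
        elif word in DAYS:
            frame.day = word


def next_system_turn(frame: WeatherFrame) -> str:
    """Ask for a missing slot (system initiative) or answer the query."""
    if frame.city is None:
        return "Which city?"
    if frame.day is None:
        return "For which day?"
    forecast = FORECASTS.get((frame.city, frame.day))
    if forecast is None:
        return f"Sorry, I have no forecast for {frame.city} {frame.day}."
    return f"The forecast for {frame.city} {frame.day} is {forecast}."
```

In this sketch, porting to a more complex domain would mean replacing the frame definition and the lookup table while keeping the turn-taking logic, which is one way to read the portability test described above.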

Objectives

The main objective of this project is to integrate different language technologies in order to develop a conversational interface that will allow users to obtain information, stored in a database, over the telephone using speech.

In developing two conversational interface demonstrators, this project will contribute to the improvement of core language technologies that have not been integrated before by the working team. By bringing together researchers with different areas of expertise, we expect to be able to integrate, on the one hand, speech recognition and natural language understanding, and on the other hand, speech synthesis with natural language generation. Moreover, we plan to integrate these core language technologies with dialog management and user modelling.
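
The integration described above can be pictured as a chain of modules that each handle one step of a dialog turn. The following is a minimal sketch under stated assumptions, not the DIGA architecture: the `Pipeline` class, its field names, and all the `fake_*` stand-in stages are hypothetical, and audio is represented as UTF-8 bytes purely so the example can run without speech components.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Query = Dict[str, str]   # output of natural language understanding
Act = Dict[str, str]     # output of dialog management


@dataclass
class Pipeline:
    """One dialog turn: speech in, speech out, through five pluggable stages."""
    recognise: Callable[[bytes], str]    # speech recognition
    understand: Callable[[str], Query]   # natural language understanding
    manage: Callable[[Query], Act]       # dialog management + database access
    generate: Callable[[Act], str]       # natural language generation
    synthesise: Callable[[str], bytes]   # speech synthesis

    def turn(self, audio_in: bytes) -> bytes:
        text = self.recognise(audio_in)
        query = self.understand(text)
        act = self.manage(query)
        reply = self.generate(act)
        return self.synthesise(reply)


# Stand-in stages (assumptions for illustration only).
WEATHER_DB = {"porto": "raining", "lisboa": "sunny"}


def fake_recognise(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_understand(text: str) -> Query:
    # Treat the last word as the city name.
    return {"intent": "weather", "city": text.lower().split()[-1]}


def fake_manage(query: Query) -> Act:
    condition = WEATHER_DB.get(query["city"], "unknown")
    return {"act": "inform", "city": query["city"], "condition": condition}


def fake_generate(act: Act) -> str:
    return f"It is {act['condition']} in {act['city']}."


def fake_synthesise(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = Pipeline(fake_recognise, fake_understand, fake_manage,
                    fake_generate, fake_synthesise)
```

Keeping each stage behind a plain function signature means any one module can be replaced or evaluated on its own, which is the kind of separation an integration effort across several teams would typically need.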

Such a project will also be of primary importance in setting up the research infrastructure for a relatively large number of theses: three doctoral and three master's.

Workplan

T1 - Demonstrator Specification
T2 - System Architecture
T3 - Speech Recognition
T4 - Query Understanding
T5 - Dialogue Representation and Management
T6 - Language Generation
T7 - Speech Synthesis
T8 - Integration and Evaluation
