To ease the customization of spoken dialogue systems to new domains, the system's core should be concerned only with interacting with the user. Accessing background systems and reasoning about the domain should be delegated to a module responsible for managing the domain knowledge. Recent work has shown that ontologies play a role in domain knowledge representation, since the knowledge collected in an ontology can be used in all modules of a spoken dialogue system: recognition (it contains the domain-specific terms); interpretation (mapping synonyms to concepts); generation (choosing preferred terms); and dialogue management (resolving under/over-specification, and improving dialogue coherence by ordering the system's initiative interactions according to relatedness). This work aims to follow in the footsteps of the use of ontologies in spoken dialogue systems and take it further, as the current state of the art uses only taxonomical knowledge (IS-A and PART-OF relations).
Since the 1980s, the Natural Language Processing community has used spoken dialogue systems as a case study. This choice is explained by the simplicity that comes from treating restricted domains. The multidisciplinarity involved is one of the riches of this field, as it brings together people from several communities: signal processing (for speech recognition and synthesis); artificial intelligence (for the interpretation of spoken utterances); and software engineering (for proposing more efficient architectures). However, the complexity of these systems makes them expensive to develop, and difficult to adapt to new types of users, new services, new languages and new scenarios.
The field of Ontologies appeared in the 1990s, but only recently have ontologies been perceived as truly valuable, as effective results are being achieved through their use, reuse and sharing. Ontologies aim at capturing static domain knowledge in a generic way and providing a commonly agreed understanding of a given domain. The main purpose is to share and reuse that knowledge across applications. In this sense, an ontology is a shared specification of a conceptualization. A domain ontology mainly collects the relevant concepts of a domain and the relations between them; it usually also represents some formal restrictions verified in the domain. Therefore, ontologies typically comprise three types of entities: classes, relations, and axioms. The class hierarchy defined by IS-A relations corresponds to the domain taxonomy. Currently, the main challenges in this area include the definition of a clear building process, automatic learning of ontologies, transparent access to information, and efficient inference based on the available knowledge.
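As an illustration, the three kinds of entities just mentioned can be sketched in a few lines of code. This is a minimal, hypothetical example: the class names (Device, Lamp, Room) and the structure are invented for illustration and are not taken from any particular ontology discussed in this work.

```python
# Minimal sketch of an ontology: classes, relations, and axioms.
# All names below (Device, Lamp, Room, ...) are illustrative assumptions.

class Ontology:
    def __init__(self):
        self.is_a = {}        # class -> parent class (the taxonomy)
        self.relations = []   # (subject, relation, object) triples
        self.axioms = []      # callables encoding formal restrictions

    def add_class(self, cls, parent=None):
        self.is_a[cls] = parent

    def ancestors(self, cls):
        """Walk IS-A links upward: the class hierarchy is the domain taxonomy."""
        while self.is_a.get(cls) is not None:
            cls = self.is_a[cls]
            yield cls

onto = Ontology()
onto.add_class("Device")
onto.add_class("Lamp", parent="Device")
onto.add_class("DimmableLamp", parent="Lamp")
# A non-taxonomical (extra IS-A) relation:
onto.relations.append(("Lamp", "PART-OF", "Room"))

print(list(onto.ancestors("DimmableLamp")))  # ['Lamp', 'Device']
```

The IS-A links alone already give the taxonomy; the relations list is where extra-taxonomical knowledge, largely unexplored so far, would live.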
Separating the domain knowledge from the language features of spoken dialogue systems has been shown to reduce the complexity of the systems' components [13, 14]. Moreover, if the domain knowledge is already available, reusing it is crucial to reduce the effort needed to build a new dialogue system on that subject. Recent work has shown the advantages of using ontologies to enhance or extend spoken dialogue systems. Milward argues that his ontology-based dialogue system for home information and control is dynamically reconfigurable: new devices can be added and users can subscribe to them; asynchronous device input is allowed; unnatural scripted dialogues are avoided; and flexible multimodal interaction is provided for all users, including the elderly and the disabled. Recognition, interpretation, generation and dialogue management also become more flexible, as the knowledge encoded in the ontology can be used dynamically. Flycht-Eriksson argues that separating dialogue management from domain knowledge management is crucial to reduce the complexity of the systems and to ease further extensions. Both these works focus on IS-A and PART-OF relations to solve under/over-specification. This is helpful in medical dialogue systems that need taxonomical knowledge of the domain. Using more relations remains a challenge, as the complexity increases.
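To make the under/over-specification idea concrete, the following sketch shows one way taxonomical knowledge can drive a clarification question: when the user's reference names a class with several subclasses, the system asks which one is meant. The IS-A table and class names are invented for illustration, not taken from the systems cited above.

```python
# Illustrative sketch (all names are assumptions): resolving an
# under-specified user reference by descending the IS-A taxonomy.

IS_A = {  # child -> parent
    "CeilingLamp": "Lamp",
    "DeskLamp": "Lamp",
    "Lamp": "Device",
    "Heater": "Device",
}

def subclasses(cls):
    """Direct subclasses of a class; candidates for a clarifying question."""
    return sorted(c for c, p in IS_A.items() if p == cls)

def clarify(mention):
    """If the mention is under-specified (has subclasses), ask which one."""
    options = subclasses(mention)
    if options:
        return f"Which {mention} do you mean: {', '.join(options)}?"
    return f"OK, operating the {mention}."

print(clarify("Lamp"))    # asks to choose between CeilingLamp and DeskLamp
print(clarify("Heater"))  # already specific: no clarification needed
```

Over-specification is the symmetric case: climbing the IS-A (or PART-OF) links upward to find the class the system actually knows how to operate on.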
The main goal of this project is to enhance spoken dialogue systems to make them more general and domain-independent. This means that knowledge should be introduced into the system easily and transparently. To this end, dialogue management should be separated from domain knowledge management: not only by dedicating a system module to it (the service manager) that has to be adapted to each domain, but, more importantly, by defining the kind of domain knowledge needed and creating an abstraction to represent it. For example, the dialogue system needs to know the possible words in the next expected response from the user, and that depends only on the domain. Contributions from the ontologies field will be explored with regard to knowledge manipulation in a generic spoken dialogue system. Some work has been done in the field but, at least for now, most of it is limited to the hierarchical knowledge (classes and IS-A relations) and under/over-specification (PART-OF relations) usually represented in ontologies. The extra-taxonomical knowledge is still being ignored, but should be considered. The most interesting question is whether ontologies can enrich spoken dialogue systems and be used by them in such a way that the system abstracts from the source of knowledge, focusing only on the dialogue phenomena, with no architectural adaptation needed to include new domains. The definition of a dialogue system as the instantiation of a spoken dialogue system will be explored after the existing dialogue systems and ontologies have been studied and categorized according to the tasks they perform and the knowledge sources they use.
At present, the Spoken Language Systems Lab (L2F) participates in a project in the "House of the Future" at the Portuguese Communications Foundation. The house has a spoken dialogue system [15, 16] based on the TRIPS architecture [17, 18], where a virtual butler named Ambrósio helps the user in daily tasks involving devices and services, through speech commands. Whenever clarification is needed, further dialogue ensues. Some problems remain open that need more domain knowledge to be solved (for example, the treatment of synonyms). To act in response to the user, the system needs to know which devices are connected, which services are available, and which actions can be performed. Currently, this information is stored for each service or device: the available operations, the needed parameters, and the possible values for each one. This kind of architecture is very common in the field. Nevertheless, it is always necessary to adapt something in the system, create a dedicated database, or, at least, a view of the data. Recent work by Filipe has enhanced the access to services and abstracted the database views into an Application Programming Interface (API). This work is being studied at present and will be taken into consideration, as some advances have been produced. An ontology on the cooking domain has been built. A survey of the major areas involved (ontologies and dialogue systems) is being made. A first version of a cooking butler is also being implemented, which lets the user choose a recipe from a list and then dictates it; forward and rewind commands are available.
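The per-device information described above (operations, parameters, possible values) can be sketched as a simple nested table. This is a hypothetical illustration: the device names, operations and values are invented, not taken from the actual system, but the structure shows how such a store could also answer the earlier question of which words to expect in the user's next response.

```python
# Hypothetical sketch of the per-service/per-device information the text
# describes: available operations, needed parameters, and possible values.
# All device and operation names are illustrative assumptions.

SERVICES = {
    "lamp": {
        "switch": {"state": ["on", "off"]},
        "dim": {"level": ["low", "medium", "high"]},
    },
    "recipe_reader": {
        "play": {},
        "forward": {},
        "rewind": {},
    },
}

def expected_values(device, operation, parameter):
    """Words the recognizer may expect next -- this depends only on the domain."""
    return SERVICES.get(device, {}).get(operation, {}).get(parameter, [])

print(expected_values("lamp", "dim", "level"))  # ['low', 'medium', 'high']
```

Abstracting this table behind an API, as in Filipe's work, is what lets the dialogue core remain unchanged when a new device or domain is plugged in.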
Nuno J. Mamede received his PhD from Instituto Superior Tecnico (IST). He has been a lecturer at this university since 1982, and a researcher at INESC-ID, in Lisbon, since its creation in 1980. He participated in the foundation of L2F, where he holds a position on the Executive Board. His activities have been in the area of written NLP.
H. Sofia Pinto is an Assistant Professor at the Department of Computer Science and Engineering of IST. She holds a PhD in Computer Science from IST. She was a visiting student in the Knowledge Reuse group at LIA, School of Computer Science, at the Polytechnic University of Madrid. She was an invited researcher at AIFB and ISWeb. She has worked mainly in knowledge representation: ontologies, knowledge sharing and reuse, inheritance systems, and truth maintenance systems. She is a member of the IEEE SUO working group.
James F. Allen is the John Dessauer Professor of Computer Science at the University of Rochester, and Pace Eminent Scholar at the Institute for Human and Machine Cognition in Pensacola, Florida. He received his PhD in Computer Science from the University of Toronto and received the Presidential Young Investigator award from the NSF. He is a Fellow of the American Association for Artificial Intelligence and was editor-in-chief of the journal Computational Linguistics. Over the past twenty years, he has been the principal investigator of research grants totaling over $20 million from DARPA, ONR and NSF. He has authored several books and numerous research papers in natural language understanding, knowledge representation and reasoning, and spoken dialogue systems.