Dynamic use of Ontologies to enhance Dialogue Systems

From HLT@INESC-ID

Joana Paulo

Abstract

To ease the customization of spoken dialogue systems to new domains, the system's core should be concerned only with interacting with the user. Accessing background systems and reasoning about the domain should be delegated to a module responsible for managing the domain knowledge. Recent work has shown that ontologies have a role in domain knowledge representation, as the knowledge collected in an ontology can be used by all the modules of a spoken dialogue system: recognition (since it holds the domain-specific terms); interpretation (mapping synonyms to concepts); generation (choosing preferred terms); and dialogue management (resolving under- and over-specification, and improving dialogue coherence by ordering the system's initiative interactions according to relatedness). This work aims to follow in the footsteps of this use of ontologies in spoken dialogue systems and to take it further, as the current state of the art only uses taxonomical knowledge (IS-A and PART-OF relations).

State of the Art

Spoken Dialogue Systems

Since the 1980s, the Natural Language Processing community has used spoken dialogue systems as a case study [1]. This choice is explained by the simplicity that comes from treating restricted domains. The multidisciplinarity involved is one of the strengths of this field, as it brings together people from several communities, such as signal processing (for speech recognition [2] and synthesis [3]), artificial intelligence (for the interpretation of spoken utterances [4]), and software engineering (for the proposal of more efficient architectures [5]). However, the complexity of these systems makes them expensive to develop [6] and difficult to adapt [7] to new types of users, new services, new languages and new scenarios.

Ontologies

The field of Ontologies appeared in the 1990s [8], but only recently has it been perceived as more valuable, as effective results are being achieved through the use, reuse and sharing of ontologies. Ontologies aim at capturing static domain knowledge in a generic way and at providing a commonly agreed understanding of a given domain; their main purpose is to share and reuse that knowledge across applications. In this sense, an ontology is a shared specification of a conceptualization. Mainly, a domain ontology collects the relevant concepts of a domain and the relations between them; it usually also represents some formal restrictions verified in the domain. Therefore, ontologies typically have three types of entities: classes, relations, and axioms. The class hierarchy, defined by IS-A relations, corresponds to the domain taxonomy. Currently, the main challenges in this area include the definition of a clear building process [9], the automatic learning of ontologies [10], transparent access to information [11], and efficient inference based on the available knowledge [12].
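The three kinds of entities described above can be sketched in a few lines of code. This is a minimal illustration, not an actual ontology formalism: all names (Device, Lamp, the IS-A and PART-OF tables, the sample axiom) are invented for the example.

```python
IS_A = {            # class hierarchy: child -> parent (the domain taxonomy)
    "Lamp": "Device",
    "DimmableLamp": "Lamp",
    "Room": "Location",
}

PART_OF = {         # an extra structural relation between concepts
    "Lamp": "Room",
}

def is_subclass(child, ancestor):
    """Follow IS-A links upward to decide subsumption."""
    while child is not None:
        if child == ancestor:
            return True
        child = IS_A.get(child)
    return False

# A toy axiom checked against the represented knowledge: everything
# that is PART-OF something must itself be a Device.
def check_axiom():
    return all(is_subclass(c, "Device") for c in PART_OF)

print(is_subclass("DimmableLamp", "Device"))  # True
print(check_axiom())                          # True
```

Real ontology languages (e.g. OWL) express the same three ingredients declaratively and with far richer axioms; the point here is only the separation between taxonomy, other relations, and constraints.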

Use of Ontologies in Dialogue Systems

Separating the domain knowledge from the language features of spoken dialogue systems has proven to reduce the complexity of their components [13, 14]. Moreover, if the domain knowledge is already available, reusing it is crucial to reduce the effort needed to build a new dialogue system on that subject. Recent work has shown the advantages of using ontologies to enhance or extend spoken dialogue systems. Milward [13] argues that his ontology-based dialogue system for home information and control provides a dynamically reconfigurable system where new devices can be added and users can subscribe to them; asynchronous device input is allowed; unnatural scripted dialogues are avoided; and flexible multimodal interaction is provided for all users, including the elderly and the disabled. Recognition, interpretation, generation and dialogue management also become more flexible, as the knowledge coded in the ontology can be used dynamically. Flycht-Eriksson [14] argues that separating dialogue management from domain knowledge management is crucial to reduce the complexity of these systems and to ease further extensions. Both works focus on the IS-A and PART-OF relations to solve under/over-specification. This is helpful in medical dialogue systems, which need taxonomical knowledge of the domain. Using more relations remains a challenge, as the complexity increases.
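To make the under-specification case concrete, the following sketch shows how taxonomical knowledge can drive a clarification step, in the spirit of [13, 14]. The tiny taxonomy and the concept names are invented for illustration.

```python
# Toy taxonomy: child concept -> parent concept (IS-A relation).
IS_A = {
    "KitchenLamp": "Lamp",
    "BedsideLamp": "Lamp",
    "CeilingFan": "Fan",
}

def instances_of(concept):
    """All concepts that directly specialise the requested one."""
    return [c for c, parent in IS_A.items() if parent == concept]

def interpret(requested_concept):
    """Map a user request to an action, or to a clarification question."""
    candidates = instances_of(requested_concept)
    if len(candidates) == 1:
        return ("act", candidates[0])
    if len(candidates) > 1:            # under-specified: ask the user
        return ("clarify", candidates)
    return ("act", requested_concept)  # already fully specified

print(interpret("Lamp"))  # ('clarify', ['KitchenLamp', 'BedsideLamp'])
print(interpret("Fan"))   # ('act', 'CeilingFan')
```

"Turn on the lamp" with two lamps in the taxonomy triggers a clarification question, while an unambiguous request is acted upon directly; this is exactly the kind of behaviour the cited systems derive from IS-A and PART-OF knowledge.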

Goals

The main goal of this project is to enhance spoken dialogue systems to make them more general and domain-independent, meaning that knowledge should be introduced into the system easily and transparently. To this end, dialogue management should be separated from domain knowledge management: not only by dedicating a system module to it (the service manager), which has to be adapted to each domain, but also by defining the kind of domain knowledge needed and creating an abstraction to represent it. For example, the dialogue system needs to know the possible words in the next expected response from the user, and that depends only on the domain. Contributions from the ontologies field will be explored with regard to knowledge manipulation in a generic spoken dialogue system. Some work has been done in the field but, at least for now, most of it is restricted to the hierarchical knowledge (classes and IS-A relations) and under/over-specification (PART-OF relations) usually represented in ontologies. Extra-taxonomical knowledge is still being ignored, but should be considered. The most interesting question is whether ontologies can enrich spoken dialogue systems, and be used by them, in such a way that the system abstracts away the source of knowledge, focusing only on the dialogue phenomena, with no architectural adaptation needed to include new domains. After the existing dialogue systems and ontologies have been studied and categorized according to the tasks they perform and the knowledge sources they use, the definition of a dialogue system as an instantiation of a generic spoken dialogue system will be explored.
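The abstraction argued for above can be pictured as an interface that the dialogue manager queries without knowing where the knowledge lives. The interface name, the method, and the example lexicon below are assumptions made for illustration only, not an existing API of any system.

```python
from typing import Protocol

class DomainKnowledge(Protocol):
    """What the dialogue manager needs, independently of the domain."""
    def expected_words(self, slot: str) -> list: ...

class OntologyBackedKnowledge:
    """One possible implementation, reading candidate terms per slot."""
    def __init__(self, lexicon):
        self._lexicon = lexicon

    def expected_words(self, slot):
        return self._lexicon.get(slot, [])

# A home-control domain; a cooking domain would supply a different
# lexicon behind the same interface, with no architectural change.
domain = OntologyBackedKnowledge({"device": ["lamp", "radiator", "blinds"]})

print(domain.expected_words("device"))  # ['lamp', 'radiator', 'blinds']
```

The dialogue manager could hand `expected_words("device")` to the recognizer to constrain the next recognition pass, which is one of the ontology-driven uses listed in the abstract.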

Detailed Description

Context

At the present time, the Spoken Language Systems Lab (L2F) participates in a project in the “House of the Future” at the Portuguese Communications Foundation. The house has a spoken dialogue system [15, 16], based on the TRIPS architecture [17, 18], where a virtual butler named Ambrósio helps the user in daily tasks dealing with devices and services, through speech commands. Whenever clarification is needed, further dialogue ensues. Some open problems remain that need more domain knowledge to be solved (for example, the treatment of synonyms). To act in response to the user, the system needs to know which devices are connected, which services are available and which actions can be performed. Currently, this information is stored for each service or device: the available operations, the needed parameters and the possible values for each one. This kind of architecture is very common in the field; nevertheless, it is always necessary to adapt something in the system, create a dedicated database or, at least, a view of the data. Recent work by Filipe [19] has enhanced the access to the services and abstracted the database views in order to create an Application Programming Interface (API). This work is being studied at the present time and will be taken into consideration, as some advances have been produced. An ontology of the cooking domain has been built [20], and a survey of the major areas involved (ontologies and dialogue systems) is being made. A first version of a cooking butler is also being implemented, which lets the user choose, from a list of recipes, one that is then read aloud; forward and rewind commands are available.
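The per-device information described above (operations, parameters and their possible values) could be laid out as plain data, as in the following sketch. The layout, the device, and the helper function are guesses for illustration, not the actual format used in the house system.

```python
# A hypothetical device description: operations map to their
# parameters, and each parameter lists its admissible values.
tv = {
    "device": "television",
    "operations": {
        "switch": {"state": ["on", "off"]},
        "set_channel": {"channel": [str(n) for n in range(1, 100)]},
    },
}

def valid_request(device, operation, **params):
    """Check a user request against the device description."""
    spec = device["operations"].get(operation)
    if spec is None:
        return False
    return all(params.get(p) in allowed for p, allowed in spec.items())

print(valid_request(tv, "switch", state="on"))      # True
print(valid_request(tv, "switch", state="louder"))  # False
```

A domain-knowledge module holding such descriptions is exactly what an ontology could replace or feed, removing the need for a dedicated database per domain.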

Advisors

Nuno J. Mamede received his PhD from Instituto Superior Técnico (IST). He has been a lecturer at this university since 1982, and a researcher at INESC-ID, in Lisbon, since its creation in 1980. He participated in the foundation of L2F, where he holds a position on the Executive Board. His activities have been in the area of written natural language processing.

H. Sofia Pinto is an Assistant Professor at the Department of Computer Science and Engineering of IST. She holds a PhD in Computer Science from IST. She was a visiting student at the Knowledge Reuse group at LIA, School of Computer Science, at the Polytechnic University of Madrid, Spain. She was an invited researcher at AIFB and ISWeb. She has worked mainly in knowledge representation: ontologies, knowledge sharing and reuse, inheritance systems, and truth maintenance systems. She is a member of the IEEE SUO work group.

James F. Allen is the John H. Dessauer Professor of Computer Science at the University of Rochester, and Pace Eminent Scholar at the Institute for Human and Machine Cognition in Pensacola, Florida. He received his PhD in Computer Science from the University of Toronto, and received the Presidential Young Investigator award from the NSF. He is a Fellow of the American Association for Artificial Intelligence and was editor-in-chief of the journal Computational Linguistics. Over the past twenty years, he has been the principal investigator of research grants totaling over $20 million from DARPA, ONR and NSF. He has authored several books and numerous research papers on natural language understanding, knowledge representation and reasoning, and spoken dialogue systems.

Bibliography

[01] Ron A. Cole, Joseph Mariani, Hans Uszkoreit, Giovanni Batista Varile, Annie Zaenen, Antonio Zampolli, Victor Zue (editors), 1997. Survey of the State of the Art in Human Language Technology. Center for Spoken Language Understanding CSLU, Carnegie Mellon University, Pittsburgh, PA.
[02] Daniel Jurafsky and James H. Martin, 2000. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Prentice-Hall, ISBN: 0-13-095069-6.
[03] Xuedong Huang, Alex Acero, and Hsiao-Wuen Hon, 2001. Spoken Language Processing: A Guide to Theory, Algorithm, and System Development. Prentice Hall, ISBN: 0-13-022616-5.
[04] James F. Allen, 1994. Natural Language Understanding, 2nd Edition. Benjamin/Cummings (1st Edition, 1987).
[05] Michael F. McTear, 2002. Spoken dialogue technology: enabling the conversational user interface. ACM Computing Surveys, 34(1), pages 90–169.
[06] James Allen, Donna Byron, Myroslava Dzikovska, George Ferguson, Lucian Galescu, and Amanda Stent, 2000. An Architecture for a Generic Dialogue Shell. Natural Language Engineering, 6(3).
[07] Markku Turunen and Jaakko Hakulinen, 2003. Jaspis2 - An Architecture For Supporting Distributed Spoken Dialogues. In Proceedings of the Eurospeech’ 2003: pages 1913-1916.
[08] Thomas R. Gruber, 1993. A translation approach to portable ontology specifications. Knowledge Acquisition, 5(2):199–220. ISSN: 1042-8143
[09] H. Sofia Pinto and João Pavão Martins, 2004. Ontologies: How can they be built? Knowledge Information Systems, 6(4), pages 441–464.
[10] Alexander Maedche and Steffen Staab, 2004. Ontology learning. In Handbook on Ontologies, Steffen Staab and Rudi Studer (editors), chapter 9, pages 173-190. Springer, International Handbooks on Information Systems. ISBN: 3-540-40834-7.
[11] Yolanda Gil, Enrico Motta, V. Richard Benjamins and Mark Musen (editors), 2005. The Semantic Web. 4th International Semantic Web Conference, ISWC’2005, Ireland, Proceedings Series: Lecture Notes in Computer Science, Vol. 3729. ISBN: 3-540-29754-5.
[12] Franz Baader, Diego Calvanese, Deborah McGuinness, Daniele Nardi, and Peter F. Patel-Schneider (editors), 2003. The Description Logic Handbook: Theory, Implementation, and Applications. Cambridge University Press.
[13] David Milward and Martin Beveridge, 2003. Ontology-based dialogue systems. In 3rd Workshop on Knowledge and Reasoning in Practical Dialogue Systems, 18th International Joint Conference on Artificial Intelligence (IJCAI-2003).
[14] Annika Flycht-Eriksson, 2004. Design and Use of Ontologies in Information-providing Dialogue Systems. Dissertation, Linköping Studies in Science and Technology, Thesis n. 874, School of Engineering, Linköping University. ISBN: 91-7373-947-2.
[15] Pedro Madeira, Márcio Mourão and Nuno J. Mamede, 2003. STAR – A Multiple Domain Dialog Manager. In Proceedings of the 5th International Conference on Enterprise Information Systems, Angers, France, pages 490–494.
[16] Márcio Mourão, Renato Cassaca and Nuno J. Mamede, 2004. An Independent Domain Dialogue System Through a Service Manager. In Proceedings of the 4th International Conference EsTAL – España for Natural Language Processing, Springer-Verlag, pages 161–171.
[17] James Allen, George Ferguson and Amanda Stent, 2001. An architecture for more realistic conversational systems. In Proceedings of Intelligent User Interfaces 2001 (IUI-01), Santa Fe, New Mexico, United States, pages 1–8.
[18] James Allen, George Ferguson, Amanda Stent, Scott Stoness, Mary Swift, Lucian Galescu, Nathan Chambers, Ellen Campana and Gregory Aist, 2005. Two diverse systems built using generic components for spoken dialogue (recent progress on TRIPS). In A. Arbor (Ed.), Proceedings of the Interactive Poster and Demonstration Sessions at the 43rd Annual Meeting of the Association for Computational Linguistics (ACL-2005), Michigan, USA, pages 85–88.
[19] Porfírio Pena Filipe and Nuno J. Mamede, 2004. Towards Ubiquitous Task Management. In Proceedings of the 8th International Conference on Spoken Language Processing (ICSLP/Interspeech 2004).
[20] Fernando M. Batista, Joana L. Paulo, Nuno J. Mamede, Paula C. Vaz and Ricardo D. Ribeiro, 2005. Ontology construction: cooking domain. Technical report for the Knowledge Sharing and Reuse course, Instituto Superior Técnico, Universidade Técnica de Lisboa, Lisboa, Portugal.

Publications arising from this thesis