Taking the Turn — Or Not: Turn Management in Spoken Dialogue Systems


Julia Hirschberg
Julia Hirschberg is a professor in the Department of Computer Science at Columbia University, currently on sabbatical at KTH in Stockholm. She received her PhD in Computer Science from the University of Pennsylvania, after previously completing a PhD in sixteenth-century Mexican social history at the University of Michigan and teaching history at Smith. She worked at Bell Laboratories and AT&T Laboratories - Research from 1985 to 2003 as a Member of Technical Staff and a Department Head, creating the Human-Computer Interface Research Department there. She served as editor-in-chief of Computational Linguistics from 1993 to 2003 and was an editor-in-chief of Speech Communication from 2003 to 2006. She was on the Executive Board of the Association for Computational Linguistics (ACL) from 1993 to 2003, has been on the Permanent Council of the International Conference on Spoken Language Processing (ICSLP) since 1996, and served on the board of the International Speech Communication Association (ISCA) from 1999 to 2007 (as President from 2005 to 2007). She is on the board of the CRA-W and has been active in working for diversity at AT&T and at Columbia. She has been a fellow of the American Association for Artificial Intelligence since 1994 and an ISCA Fellow since 2008. She received a Columbia Engineering School Alumni Association (CESAA) Distinguished Faculty Teaching Award in 2009.
  • 15:00, Friday, July 17th, 2009
  • Room V1.17, Civil Engineering building, IST-Alameda


  • Julia Hirschberg, KTH (Sweden) and Columbia University (USA)


Listeners have many options in dialogue: they may interrupt the current speaker, take the turn after the speaker has finished, remain silent and wait for the speaker to continue, or backchannel to indicate that they are still listening without taking the turn. I will discuss three of these options, which are particularly difficult, yet particularly important, for Spoken Dialogue Systems to distinguish: taking the turn vs. backchanneling vs. remaining silent and letting the speaker continue. How can the system determine which option the user is choosing? How can the system decide which option it should choose, and how best to signal this to the user? I will describe results of an empirical study of these phenomena in the context of a larger study of human-human turn-taking behavior in the Columbia Games Corpus. This is joint work with Agustín Gravano (University of Buenos Aires).