A multi-microphone approach to speech processing in a smart-room environment



Alberto Abad Gareta


  • 14:30, February 23, 2007



Recent advances in computer technology, speech and language processing, and image processing have made new forms of person-machine communication and computer assistance to human activities appear feasible. In particular, interest in developing challenging new applications for indoor environments equipped with multiple sensors, also known as smart rooms, has grown considerably in recent years.

In recent years, the UPC has participated in the EU-funded CHIL project (Computers in the Human Interaction Loop). The project aimed mainly to develop intelligent services capable of assisting and complementing human activities while requiring as little awareness as possible from the users. Consequently, it needed perceptual user interfaces that were multimodal and robust, and that relied on unobtrusive sensors.

My most recent work is closely related to the acoustic research carried out at the UPC within the CHIL project. In particular, I have been investigating multi-microphone approaches to speech processing as a possible solution to the problems that arise when deploying hands-free speech applications in real room environments. First, I will describe some of the work carried out on automatic speech recognition (ASR) with microphone arrays. Then, I will briefly comment on my work on speaker tracking and head orientation estimation.