Coupling Pattern Recognition and Signal Processing

Ahmed Hussen Abdelaziz
Ahmed Hussen Abdelaziz received the B.Sc. degree in Electrical Engineering and Information Technology from Al-Azhar University, Cairo, Egypt, in 2006 and the M.Sc. degree in Electrical Engineering and Information Technology from Ruhr-Universität Bochum, Germany, in 2010. He is currently pursuing the Ph.D. degree at Ruhr-Universität Bochum.

Since 2010, he has been a research and teaching assistant in the Digital Signal Processing Group, Institute of Communication Acoustics, Ruhr-Universität Bochum, Germany. His research interests include pattern recognition and signal processing.

Date

  • 15:00, Friday, July 19th, 2013
  • Room 020, INESC-ID

Speaker

  • Ahmed Hussen Abdelaziz, Institut für Kommunikationsakustik, Ruhr-Universität Bochum

Abstract

Signal processing and pattern recognition are often treated as separate problems. However, coupling them tightly can significantly improve performance in both tasks. In this talk, we introduce two new approaches to such a coupling, which provide more precise input from signal processing to pattern recognition and vice versa. We start by coupling pattern recognition models with signal processing algorithms through a new statistical model for speech enhancement, the twin hidden Markov model (THMM). With the THMM, hidden Markov models (HMMs) can be exploited to enhance speech signals in a recognize-and-synthesize scheme that uses the most appropriate features for recognition and for synthesis. We then introduce a new approach for coupling signal processing with pattern recognition, called significance decoding (SD). SD is a new uncertainty-of-observation technique that uses the feature uncertainties estimated by the signal processing algorithm to improve the accuracy of automatic speech recognition under adverse environmental conditions. Finally, we combine the two schemes in the context of audio-visual speech recognition to improve its performance in very noisy environments.