Privacy-Preserving Speech and Audio Processing

Bhiksha Raj

Bhiksha Raj is an Associate Professor in the Language Technologies Institute of the School of Computer Science at Carnegie Mellon University, with additional affiliations with the Electrical and Computer Engineering and Machine Learning departments. Dr. Raj obtained his PhD from CMU in 2000 and was at Mitsubishi Electric Research Laboratories from 2001 to 2008. His chief research interests lie in automatic speech recognition, computer audition, machine learning and data privacy. His latest research interests lie in the newly emerging field of privacy-preserving speech processing, in which his research group has made several contributions.

Date

14:00, Friday, September 28th, 2012
Room 020, INESC-ID

Speaker

Bhiksha Raj, Carnegie Mellon University

Abstract

The privacy of personal data has generally been considered inviolable. On the other hand, in nearly any interaction, whether it is with other people or with computerized systems, we reveal information about ourselves. Sometimes this is intended, for instance when we use a biometric system to authenticate ourselves, or when we explicitly provide personal information in some manner. Often, however, it is unintended; for instance a simple search performed on a server reveals information about our preferences. An interaction with a voice recognition system reveals information to the system about our gender, nationality (accent), and possibly emotional state and age.

Regardless of whether the exposure of information is intentional or not, it could be misused, potentially putting us at financial, social and even physical risk. These concerns about exposure of information have spawned a large and growing body of research, addressing various issues about how information may be leaked, and how to protect it.

One area of concern is sound data, particularly voice. For instance, voice-authentication systems and voice-recognition systems are becoming increasingly popular and commonplace. However, in the process of using these services, users expose themselves to potential abuse: as mentioned above, the server, or an eavesdropper, may obtain unintended demographic information about the user by analyzing the voice, and may sell this information. They may also edit recordings to fabricate utterances the user never spoke. Other such issues can be listed. Merely encrypting the data for transmission does not protect the user, since the recipient (the server) must finally have access to the data in the clear (i.e., in decrypted form) in order to perform its processing.

In this talk, we will discuss solutions for privacy-preserving sound processing, which enable a user to employ sound- or voice-processing services without exposing themselves to risks such as the above.

We will describe the basics of privacy-preserving techniques for data processing, including homomorphic encryption, oblivious transfer, secret sharing, and secure multiparty computation. We will describe how these can be employed to build secure "primitives" for computation that enable users to perform basic steps of computation without revealing information. We will describe the privacy issues with respect to these operations. We will then briefly present schemes that employ these techniques for privacy-preserving signal processing and biometrics.
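
As a rough illustration of what a secure "primitive" can look like, the sketch below uses additive secret sharing, one of the techniques named above, to add two private numbers without any single party seeing either of them. It is not taken from the talk: the modulus and the names `share`, `add_shares` and `reconstruct` are assumptions chosen for this example.

```python
import secrets

# Hypothetical modulus for this sketch; any sufficiently large modulus works.
MODULUS = 2**61 - 1


def share(value, n_parties=2, modulus=MODULUS):
    """Split `value` into n additive shares that sum to it modulo `modulus`."""
    shares = [secrets.randbelow(modulus) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % modulus
    return shares + [last]


def add_shares(a_shares, b_shares, modulus=MODULUS):
    """Each party adds the two shares it holds, locally and without communication."""
    return [(a + b) % modulus for a, b in zip(a_shares, b_shares)]


def reconstruct(shares, modulus=MODULUS):
    """Recombine all shares to recover the shared value."""
    return sum(shares) % modulus


# Two private inputs are split between two parties.
x_shares = share(1234)
y_shares = share(4321)

# Each party works only on its own shares; only the final sum is revealed.
sum_shares = add_shares(x_shares, y_shares)
total = reconstruct(sum_shares)
```

When only two parties hold shares, each individual share is uniformly random and reveals nothing about the input on its own. Addition is the easy case; multiplication and comparison require interaction between the parties, which is where much of the computational cost of such schemes tends to arise.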

We will then delve into uses for sound, and particularly voice processing, including authentication, classification and recognition, and discuss computational and accuracy issues.

Finally, we will present a newer class of methods based on exact matches built upon locality-sensitive hashing and universal quantization, which enable several of the above privacy-preserving operations at a different operating point of the privacy-accuracy tradeoff.
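
The idea of matching on hashes rather than on raw features can be sketched as follows. This is a minimal illustration that assumes one common locality-sensitive hashing family (signs of random projections) followed by a salted cryptographic hash; it is not necessarily the scheme presented in the talk, and it does not cover universal quantization. The names `make_hyperplanes`, `lsh_bits` and `private_key` are hypothetical.

```python
import hashlib
import random


def make_hyperplanes(dim, n_bits, seed):
    """Draw `n_bits` random hyperplanes from a seed shared by enrolment and query."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def lsh_bits(vector, hyperplanes):
    """One bit per hyperplane: which side of the hyperplane the vector falls on."""
    return "".join(
        "1" if sum(w * x for w, x in zip(plane, vector)) >= 0 else "0"
        for plane in hyperplanes
    )


def private_key(vector, hyperplanes, salt):
    """Hash the LSH bits with a secret salt; the server stores and compares only this."""
    bits = lsh_bits(vector, hyperplanes)
    return hashlib.sha256(salt + bits.encode("ascii")).hexdigest()


planes = make_hyperplanes(dim=4, n_bits=16, seed=0)
salt = b"user-secret-salt"  # assumption: known to the user, not to the server

enrolled = private_key([0.9, -0.2, 0.4, 0.1], planes, salt)
rescaled = private_key([1.8, -0.4, 0.8, 0.2], planes, salt)   # same direction
opposite = private_key([-0.9, 0.2, -0.4, -0.1], planes, salt)  # opposite direction

same_direction_matches = enrolled == rescaled
opposite_matches = enrolled == opposite
```

Because the server only tests hashed keys for equality, it sees neither the feature vector nor how close a non-matching query was. The cost is that vectors which are close but fall on different sides of any hyperplane fail to match, which is one way to read the different privacy-accuracy operating point mentioned above.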

Note: This seminar will be held in English.
