Recent work in the VIDIVIDEO project by the L2F team

Date

15:00, November 21, 2008
Room 336

Speakers

Isabel Trancoso, Rui Martins, António Serralheiro, José Portêlo, Miguel Bugalho, Thomas Pellegrini, Alberto Abad

Abstract

The goal of the European project VIDIVIDEO is to boost the performance of video search engines by forming a 1000-element thesaurus for detecting instances of audio, video, or mixed-media content. The project applies machine learning techniques to learn many different detectors from examples, using one-against-all classifiers.

This talk will start with a brief overview of the different audio-related tasks: audio segmentation, audio event detection (AED), and speech recognition (Isabel Trancoso). The current work on gender segmentation (male/female/child) and music segmentation will be mentioned very briefly (Rui Martins, António Serralheiro).

The remainder of the talk will focus on the second task (AED), starting with a presentation of the audio concepts in the VIDIVIDEO ontology (Miguel Bugalho) and the training and test corpora. The machine learning approaches we have followed and the corresponding results will be presented next (José Portêlo). We will then present some recent results with hierarchical clustering of audio events (Thomas Pellegrini).

This work involved the development of new tools that may be of interest not only for AED but for speech processing in general. Alberto Abad will describe some of these tools, which have recently been integrated into the Audimus framework.
