Semantic Processing of Multimedia Contents

From HLT@INESC-ID

Problem

Huge amounts of unstructured multimedia information exist and grow daily. Accessing and managing this information is becoming very difficult, and these sources are not being put to real use. Services for selective dissemination of information are very expensive, because the information has to be filtered manually, and they are very limited in the range of media and subjects they cover.

Goals

Media monitoring activities are expanding rapidly as new media sources emerge. In this line of work we are developing fully automatic systems that scan multimedia data from TV broadcasts, radio, and newspaper texts and analyze their content.

The systems are based on advanced processing technologies for content-based indexing of multimedia information. We use a large-vocabulary speech recognition system, combined with audio segmentation and classification, automatic topic indexing and segmentation, and summarization techniques, to generate category information as semantic markup of multimedia data. To support a knowledge representation, the data are classified according to a hierarchical thesaurus structure, and resources are described in an XML representation.
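As a rough illustration of how a thesaurus-based classification and an XML resource description might fit together, the sketch below attaches a hierarchical topic path to one story segment. It is not the project's actual schema: the thesaurus terms, the element and attribute names (segment, topic, term, transcript), and the helper functions are all hypothetical, chosen only for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical thesaurus fragment: each term maps to its broader (parent) term.
THESAURUS = {
    "football": "sports",
    "sports": "society",
    "society": None,
}

def category_path(term):
    """Return the thesaurus path from the top-level term down to `term`."""
    path = []
    while term is not None:
        path.append(term)
        term = THESAURUS[term]
    return list(reversed(path))

def describe_segment(source, start, end, transcript, term):
    """Build an XML description of one story segment (illustrative element names)."""
    seg = ET.Element("segment", source=source, start=f"{start:.1f}", end=f"{end:.1f}")
    topic = ET.SubElement(seg, "topic")
    # One <term> element per thesaurus level, broadest first.
    for level, name in enumerate(category_path(term)):
        ET.SubElement(topic, "term", level=str(level)).text = name
    ET.SubElement(seg, "transcript").text = transcript
    return seg

segment = describe_segment("tv-news", 12.0, 95.5, "The match ended two to one.", "football")
xml_text = ET.tostring(segment, encoding="unicode")
print(xml_text)
```

In this sketch the transcript would come from the speech recognizer and the term from the topic indexing step; storing the full path rather than only the leaf term lets a retrieval service match a query at any level of the hierarchy.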

As a result of this work, several system prototypes will be developed to provide useful end-user services based on the exploitation of multimedia content.

Projects

Demos

To showcase these developments, several prototypes and demos have been built. One example is a prototype resulting from the ALERT project: SSNT - Summarization of Broadcast News Services.