This page presents the SSNT service through a detailed description of its features, covering the aspects needed for a complete picture of this new service. If you prefer a quick look rather than a detailed description, we suggest watching the first news story of the most recently processed "Telejornal" news show:
If you are interested in a detailed description of the service, we propose the following set of points:
Nowadays there is a significant need to deal with large amounts of multimedia information. With this service we aim to provide selective dissemination of multimedia content, mainly TV broadcast news (BN). Advanced techniques for processing BN programs, through segmentation and categorization, make it possible to access the contents of the programs based on individually defined user profiles.
Through this new service we make available the 8 o'clock news program of the main RTP channel (Telejornal). Users define the thematic areas they are interested in and, after the program has been processed automatically, receive an email with the news stories that match the requested domains. Become one of them and start using this new service now.
This system was initially developed by a consortium of INESC ID Lisboa, 4VDO and RTP within the scope of the European project ALERT. The development of the large vocabulary continuous speech recognition system has been supported by project POSI/33846/PLP/2000, financed by FCT.
The next figure presents a functional diagram of the service.
As the functional diagram shows, the system analyses a generic multimedia document and, based on its contents, segments it into coherent blocks through video and/or audio segmentation.
When the document contains audio, an automatic transcription is performed by a large vocabulary continuous speech recognition system. Based on the block segmentation and on the text inside each block, which comes either from the transcription or from the document itself when it is text only, topics are detected automatically in each block, with the possibility of clustering several blocks into homogeneous segments according to their topics.
With the multimedia document divided into segments and a set of topics assigned to each segment, the user profiles are searched for those requesting the topics of these segments, and an alert message is generated for those users.
At the end of the process the multimedia document is loaded into a database, where we keep the document segmentation and its categorization into topics.
The system is built around three main blocks: the CAPTURE block, responsible for capturing the programs selected for monitoring; the PROCESSING block, responsible for generating the relevant markup information associated with each program; and the SERVICE block, responsible for the user interface and database management. The overall process is controlled by a simple semaphore scheme.
From a list of news shows that we intend to monitor, a web script schedules the recordings by downloading the daily time schedule (expected starting and ending times) from the TV station web site. The actual news show often runs longer than advertised in the TV station's schedule. To ensure we record the complete show, our script programs the recording to start a little earlier than announced (1 minute) and to continue well past the advertised ending time (20 minutes).
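The padding rule can be illustrated with a minimal Python sketch. Only the two margins (1 minute before, 20 minutes after) come from the description above; the function name and the example date are hypothetical, and the actual scheduling script is not shown in this page.

```python
from datetime import datetime, timedelta

# Margins taken from the description: start 1 minute early,
# keep recording 20 minutes past the advertised end.
PRE_ROLL = timedelta(minutes=1)
POST_ROLL = timedelta(minutes=20)


def recording_window(advertised_start, advertised_end):
    """Return the padded (start, end) of the recording for an advertised slot."""
    if advertised_end <= advertised_start:
        raise ValueError("advertised end must be after advertised start")
    return advertised_start - PRE_ROLL, advertised_end + POST_ROLL


# Hypothetical schedule entry: a show advertised from 20:00 to 21:00.
start, end = recording_window(
    datetime(2005, 3, 1, 20, 0), datetime(2005, 3, 1, 21, 0)
)
print("record from", start, "to", end)
```

The asymmetric margins reflect the observation that shows tend to overrun rather than start early.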
The capture script records the specified news show at the defined time using a TV capture board (Pinnacle PCTV Pro) that has direct access to a TV cable network. The recording produces two independent streams: an MPEG-2 video stream and an uncompressed, 44.1 kHz, mono, 16 bit audio stream. After the recording finishes, the audio stream is downsampled to 16 kHz. Finally, a flag file (.ready) is generated; this signal triggers the PROCESSING block.
After the PROCESSING block sends back the jingle detection information, changing the signal flag from (.ready) to (.proc), the CAPTURE block multiplexes the recorded video and audio streams together and uses the jingle detection information to cut out unwanted portions, producing an AVI file with only the news show. This multiplexed AVI file has MPEG-4 video (DivX 5.2) and MP3 audio.
After the PROCESSING block finishes and sends back the XML file, signalling the CAPTURE block with a flag (.proc.2), the CAPTURE block generates an individual AVI video file for each news report from the AVI containing the full news show. These individual AVI files have lower video quality, which is suitable for streaming to portable devices.
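The hand-over between the CAPTURE and PROCESSING blocks can be pictured with the following Python sketch of a flag-file scheme. The three suffixes (.ready, .proc, .proc.2) are the ones named above; the file naming (show identifier plus suffix), the function names and the directory layout are assumptions made for illustration, not the actual implementation.

```python
import tempfile
from pathlib import Path

# Flag suffixes named in the description, most advanced stage first.
# The meanings paraphrase the text; the naming scheme is an assumption.
STAGES = [
    (".proc.2", "XML ready: CAPTURE cuts one AVI per news report"),
    (".proc", "jingle info ready: CAPTURE multiplexes and trims the show"),
    (".ready", "recording done: PROCESSING may start"),
]


def current_stage(directory, show_id):
    """Return (suffix, meaning) for the flag found, or None if there is none."""
    for suffix, meaning in STAGES:
        if (Path(directory) / (show_id + suffix)).exists():
            return suffix, meaning
    return None


def advance(directory, show_id, old_suffix, new_suffix):
    """Signal the next stage by renaming the flag file."""
    flag = Path(directory) / (show_id + old_suffix)
    flag.rename(Path(directory) / (show_id + new_suffix))


# Walk one hypothetical show through the three signals in a scratch directory.
with tempfile.TemporaryDirectory() as workdir:
    (Path(workdir) / "telejornal.ready").touch()
    print(current_stage(workdir, "telejornal"))
    advance(workdir, "telejornal", ".ready", ".proc")
    print(current_stage(workdir, "telejornal"))
    advance(workdir, "telejornal", ".proc", ".proc.2")
    print(current_stage(workdir, "telejornal"))
```

Each block only needs to poll for the suffix it is waiting on, which is what keeps such a scheme simple.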
All the AVI video files generated are sent to the SERVICE block for conversion to Real Media format, the format we use for video streaming over the web.
The audio stream is processed through several stages that successively segment, transcribe and index it, compiling the resulting transcription and metadata into an XML file. The stages are:
First, the recorded audio file is processed by the "Jingle Detection" module, which identifies the precise news show start and end times, identifies certain portions of audio that are not relevant to the story (news fillers), and detects commercial breaks inside the news show. Based on these time instants, a new audio file is generated with only the relevant contents of the news show.
The new audio file is then fed through an Audio Pre-processor module. This module outputs a set of homogeneous acoustic segments discriminating between speech and non-speech. The speech segments contain audio from only one speaker and have markups concerning the background conditions, the speaker gender, and speaker clustering.
Each audio segment that was marked by the Audio Pre-processor as containing speech (transcribable segment) is then processed by the Speech Recognition module.
The Topic Segmentation module groups segments into complete and homogeneous stories. For each story, the Topic Indexing module generates a classification of its contents according to the hierarchically organized thematic thesaurus.
Finally, a module for generating a Title and Summary is applied to each story. Since we deliver a set of stories to the user, we want a mechanism that, beyond the topic indexing, gives a close idea of the contents of each story.
After all these processing stages an XML file containing all the relevant information that we were able to extract is generated according to a DTD specification.
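As an illustration of the first stage, the cut that follows Jingle Detection can be viewed as interval subtraction: from the span between the detected show start and end, remove the fillers and commercial breaks. The Python sketch below works under that assumption; the function name and the example time instants (in seconds) are hypothetical, and the detection itself is not covered.

```python
def relevant_intervals(show_start, show_end, excluded):
    """Return the (start, end) spans inside the show that are not excluded.

    `excluded` holds (start, end) pairs for fillers and commercial breaks;
    pairs may overlap or fall partly outside the show.
    """
    kept = []
    cursor = show_start
    for ex_start, ex_end in sorted(excluded):
        # Clamp the excluded span to the show boundaries.
        ex_start = max(ex_start, show_start)
        ex_end = min(ex_end, show_end)
        if ex_start >= ex_end or ex_end <= cursor:
            continue  # outside the show, or already covered
        if ex_start > cursor:
            kept.append((cursor, ex_start))
        cursor = ex_end
    if cursor < show_end:
        kept.append((cursor, show_end))
    return kept


# Hypothetical instants: show from 60 s to 3600 s, one filler, one break.
for start, end in relevant_intervals(60, 3600, [(900, 1080), (300, 320)]):
    print(start, end)
```

Concatenating the audio of the kept spans, in order, gives a file with only the relevant contents.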
The SERVICE block is responsible for loading the XML file generated by the PROCESSING block into a database, converting the AVI video files into Real Media format, running the web video streaming server, running the web server for the user interface, managing the user profiles in a database, and sending alert messages to the users whose profiles match the news show information.
The first page supplies a short description of the service and three access points: registering as a new user (a registered user can use all the features of the service), logging in as an already registered user to reconfigure a profile (authenticated by username and password), or searching directly over the database of segmented and indexed multimedia documents. These access points are detailed below.
After choosing SSNT registo on the first page, the user is taken to a new page where some personal data must be filled in. The user becomes known to the system through a username and email address.
After filling in the personal information, the user must press the Actualizar Alterações button in the bottom right corner of the page. The user should then leave the system and wait for an email from the system with the final confirmation. That email contains a link to confirm the registration as a new user and to access the user profile definition.
The user profile definition is based on the choice of thematic domains, an onomastic index, a geographic index or free text. Each profile entry results from an AND operation over these fields. To make an entry effective, the user must click the Adicionar button. The profile definition window, in the lower part of the page, then shows a new thematic domain restricted by any onomastic index, geographic index or free text that was indicated. The user may define several such entries, which are combined through an OR operation and appear on separate lines in the profile definition window. The user may also select several thematic domains at once with the mouse and, for a given thematic domain, specify the themes associated with it more precisely, down to a maximum of two sub-levels.
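The AND/OR semantics of a profile can be sketched in Python as follows. This is only an illustration: the class and field names, the rule that an empty field places no restriction, and the substring test for free text are assumptions; the thematic sub-levels and the actual database query are left out.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class ProfileLine:
    """One line of the profile window; None means the field is unrestricted."""
    domain: Optional[str] = None  # thematic domain
    name: Optional[str] = None    # onomastic index entry
    place: Optional[str] = None   # geographic index entry
    text: Optional[str] = None    # free text


@dataclass
class Story:
    domains: Set[str]
    names: Set[str]
    places: Set[str]
    transcript: str


def line_matches(line, story):
    """AND across the fields that were filled in on this line."""
    return all([
        line.domain is None or line.domain in story.domains,
        line.name is None or line.name in story.names,
        line.place is None or line.place in story.places,
        line.text is None or line.text.lower() in story.transcript.lower(),
    ])


def profile_matches(profile, story):
    """OR across the lines of the profile."""
    return any(line_matches(line, story) for line in profile)


# Hypothetical profile: (Sports AND Lisboa) OR free text "election".
profile = [ProfileLine(domain="Sports", place="Lisboa"), ProfileLine(text="election")]
story = Story({"Sports"}, set(), {"Lisboa"}, "The match ended in a draw.")
print(profile_matches(profile, story))
```

A story that matches any single line is enough to trigger an alert for the user, while each line narrows its own domain independently.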
After selecting the items of the profile, the user must confirm the choices or changes with the Actualizar perfil button (in red) in the bottom right corner of the page. This completes the registration as a new user.
When a new program containing news on the thematic domains and features defined in the user profile enters the system, an email is sent to the user. The contents of the email are described in the next section.
For each news story the email presents a set of fields. It starts with the program title and the date and length of the story, followed by the title and a short summary of the story. It ends with the thematic domain and the onomastic and geographic indexes associated with both the story and the user profile. A link to the RealVideo file is also supplied.
With this email the service associated with SSNT is complete.
The Direct Search mode is configured like the user profile definition, but acts only on the programs stored in the database and not on new programs. The news to retrieve is specified in the same way as a user profile. After the Pesquisar button, in the bottom right corner of the page, is pressed, the database is searched for news matching the user request.
An example of results is presented in the next figure. The thematic domain is presented first, followed by the news stories that satisfy that search criterion. For each story the title, program name, broadcast date, length and summary are shown, followed by the thematic domain and the onomastic and geographic indexes associated with it. A link to the RealVideo file associated with that particular story is again provided. The user may perform a new search if desired.
This detailed explanation of the contents of the different pages completes the presentation of this new service.
This system presents a set of innovative features based on speech processing techniques and topic detection. However, due to the development conditions, we know that the system still has a number of limitations. Among them we highlight the following:
After this detailed description you are ready to access the system. We hope that you find in SSNT the features you need to start using this service. If you need any additional information or wish to make any comment, we are available here.