LECTRA - Rich Transcription of Lectures for E-Learning Applications

Producing automatic transcriptions of classroom lectures may be important for both e-learning and e-inclusion purposes.

The greatest research challenge is the recognition of spontaneous speech, for which the error rate is much higher than for read speech. Even human-produced transcriptions would be very difficult to understand because of the absence of punctuation and the presence of disfluencies (filled pauses, repetitions, hesitations, false starts, etc.). Hence, one has to enrich the speech transcription by adding information about sentence boundaries and speech disfluencies.

Sponsored by: FCT (POSC/PLP/58697/2004)
Start: March 2005
Duration: 2 years

Team

Project Leader: Isabel Trancoso

Undergraduate Students:

Luís Neves

This project is carried out in cooperation with IMMI (Intelligent MultiModal Interfaces), led by Prof. Joaquim Jorge.

Summary

The goal of this project is the production of multimedia lecture contents for e-learning applications. We shall take as a pilot study a course for which the didactic material (e.g. textbook, problems, viewgraphs) is already electronically available and in Portuguese. This is an increasingly frequent situation, particularly in technical courses. Our contribution to these contents will be to add, for each lecture in the course, the recorded video signal and the synchronized lecture transcription. We believe that this synchronized transcription may be especially important for hearing-impaired students.

The project will encompass five main tasks. In the first one we shall collect the training and test material (both recorded audio-video signals and textual data) related to this course. In the second task we shall use this training data to adapt the acoustic, lexical and language models of our large vocabulary continuous speech recognizer to the course domain, thus yielding a first transcription of the lecture contents. The goal of the third task is to "enrich" this transcription with metadata that would render it more intelligible. Given the state of the art in metadata extraction and the comparatively low recognition rate for spontaneous speech relative to read speech, this task is the one where the main research challenge resides. The fourth task deals with integrating the recorded audio-video and corresponding transcription with the other multimedia contents and synchronizing them according to topic, so that a student may browse through the contents, seeing a viewgraph, the corresponding part of the textbook, and the audio-video with the corresponding lecture transcription as caption. The final task is user evaluation, for which we intend to use a panel of both normal-hearing and hearing-impaired students. For the latter, we shall evaluate two types of lecture transcription: with and without manual correction. This latter evaluation will give us an indication of how close automatic lecture transcription is to being usable in real time in a classroom.

Workplan

T1 - Data collection
T2 - Model adaptation
T3 - Spontaneous speech recognition
T4 - Integration of lecture transcription with other multimedia contents
T5 - User evaluation

Demos

This demo shows the result of the application of our Broadcast News recognizer after adaptation of the acoustic, lexical and language models to the course domain (Production of Multimedia Contents).

From HLT@INESC-ID

Revision as of 08:15, 29 June 2006 by Imt (talk | contribs)