Who spoke when: Difference between revisions

From HLT@INESC-ID

(3 intermediate revisions by 2 users not shown)
__NOTOC__
{| align="right" cellpadding="3" style='width: 35%; align: right; border-style: solid; border-width: 1px; background: #dee2ff; margin-left: 8px;'
! style='text-align: center;' |Janez Žibert
|-
| Janez Žibert just finished his PhD on the problem of structuring the audio data in terms of speakers.
|}


== Date ==

* 15:00, November 23, 2006
* 4th floor meeting room


== Speaker ==

* [http://luks.fe.uni-lj.si/en/staff/janezz/index.html Janez Žibert]


== Abstract ==

The thesis addresses the problem of structuring the audio data in terms of speakers, i.e., finding the regions in the audio streams that belong to one speaker and joining each region of the same speaker together. The task of organizing the audio data in this way is known as speaker diarization and was first introduced in the NIST project of Rich Transcription in "Who spoke when" evaluations. The speaker-diarization problem is composed of several tasks. This thesis addresses three of them: speech/non-speech segmentation, speaker- and background-change detection, and speaker clustering.

The main objectives in our research were to develop new representations of audio data that were more suitable for each task and to improve the accuracy and increase the robustness of standard approaches under various acoustic and environmental conditions. The motivation for the improvement of the existing methods and the development of new procedures for speaker-diarization tasks is the design of a system for the speaker-based audio indexing of broadcast news shows.


[[category:Seminars]]
[[category:Seminars 2006]]

Latest revision as of 00:16, 24 November 2006

Janez Žibert
Janez Žibert just finished his PhD on the problem of structuring the audio data in terms of speakers.

Date

  • 15:00, November 23, 2006
  • 4th floor meeting room

Speaker

  • Janez Žibert (http://luks.fe.uni-lj.si/en/staff/janezz/index.html)

Abstract

The thesis addresses the problem of structuring the audio data in terms of speakers, i.e., finding the regions in the audio streams that belong to one speaker and joining each region of the same speaker together. The task of organizing the audio data in this way is known as speaker diarization and was first introduced in the NIST project of Rich Transcription in "Who spoke when" evaluations. The speaker-diarization problem is composed of several tasks. This thesis addresses three of them: speech/non-speech segmentation, speaker- and background-change detection, and speaker clustering.

The main objectives in our research were to develop new representations of audio data that were more suitable for each task and to improve the accuracy and increase the robustness of standard approaches under various acoustic and environmental conditions. The motivation for the improvement of the existing methods and the development of new procedures for speaker-diarization tasks is the design of a system for the speaker-based audio indexing of broadcast news shows.