https://www.hlt.inesc-id.pt/wiki/index.php?title=Audio_indexation&feed=atom&action=history
Audio indexation - Revision history
2024-03-28T15:45:55Z
Revision history for this page on the wiki
MediaWiki 1.41.0
https://www.hlt.inesc-id.pt/wiki/index.php?title=Audio_indexation&diff=2814&oldid=prev
Meinedo: Audio indexing moved to Audio indexation
2006-07-03T18:13:13Z
<p>Audio indexing moved to Audio indexation</p>
Meinedo
https://www.hlt.inesc-id.pt/wiki/index.php?title=Audio_indexation&diff=2801&oldid=prev
Meinedo: BN Audio Pre-processing moved to Audio indexing
2006-07-03T17:07:23Z
<p>BN Audio Pre-processing moved to Audio indexing</p>
Meinedo
https://www.hlt.inesc-id.pt/wiki/index.php?title=Audio_indexation&diff=2793&oldid=prev
Meinedo at 11:21, 3 July 2006
2006-07-03T11:21:56Z
<p></p>
<p><b>New page</b></p><div>Broadcast News media monitoring is an important technology, but it poses a<br />
number of challenges for speech processing, both in terms of<br />
computational complexity and transcription accuracy. In this kind of<br />
application the speech signal must be not only transcribed but also<br />
characterized in terms of its acoustic content. Most present-day<br />
transcription systems perform some kind of audio characterization<br />
(segmentation and labelling) as a first step in the processing<br />
chain.<br />
<br />
Audio pre-processing offers some practical advantages: no time is<br />
wasted processing non-speech intervals, very long speech chunks do not<br />
have to be processed in one piece, and gender- or speaker-dependent<br />
acoustic models can be selected during recognition. On the other hand,<br />
indexing errors may cause extra transcription errors when a speaker<br />
change is hypothesized in the middle of an utterance or, even worse,<br />
in the middle of a word.<br />
<br />
<br />
[[Image:audio_pre_processor.v1.block_diagram.png|center]]<br />
<br />
<br />
To accomplish this audio characterization, the first stage in our media<br />
monitoring system is an audio pre-processor. This pre-processor is<br />
responsible for:<br />
<br />
# Segmenting the signal into acoustically homogeneous regions<br />
# Classifying those segments as speech or non-speech, and by background conditions and speaker gender<br />
# Identifying all segments uttered by the same speaker<br />
<br />
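As a rough illustration of the three responsibilities listed above, the<br />
pre-processor output can be pictured as a list of labelled segments. The<br />
sketch below is only an assumption about how such output might be<br />
represented: the <code>Segment</code> fields, the label values and the helper<br />
functions are hypothetical and are not taken from the actual system.<br />
<br />

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    """One acoustically homogeneous region (hypothetical representation)."""
    start: float                      # start time in seconds
    end: float                        # end time in seconds
    is_speech: bool                   # speech / non-speech decision
    background: Optional[str] = None  # background label (illustrative values)
    gender: Optional[str] = None      # speaker gender, speech segments only
    speaker_id: Optional[str] = None  # shared by segments of the same speaker


def segments_for_speaker(segments: List[Segment], speaker_id: str) -> List[Segment]:
    """Return every speech segment attributed to the given speaker."""
    return [s for s in segments if s.is_speech and s.speaker_id == speaker_id]


def speaking_time(segments: List[Segment], speaker_id: str) -> float:
    """Total duration, in seconds, of the given speaker's segments."""
    return sum(s.end - s.start for s in segments_for_speaker(segments, speaker_id))


# Made-up example output for a short recording.
example = [
    Segment(0.0, 4.0, is_speech=False, background="music"),
    Segment(4.0, 12.5, is_speech=True, background="clean", gender="female", speaker_id="spk01"),
    Segment(12.5, 20.0, is_speech=True, background="noise", gender="male", speaker_id="spk02"),
    Segment(20.0, 27.5, is_speech=True, background="clean", gender="female", speaker_id="spk01"),
]

spk01_segments = segments_for_speaker(example, "spk01")
```

With segments labelled this way, retrieving all occurrences of one speaker<br />
reduces to a filter on <code>speaker_id</code>, and non-speech segments can be<br />
skipped before recognition.<br />
<br />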
The segmentation provides information about speaker turns and<br />
identities, allowing automatic retrieval of all occurrences of a<br />
particular speaker. The segmentation can also be used to improve<br />
performance through adaptation of the speech recognition acoustic<br />
models. Additionally, the final transcriptions, enriched with the<br />
pre-processing information, are somewhat easier for humans to read.</div>
Meinedo