https://www.hlt.inesc-id.pt/wiki/index.php?title=Language_identification_on_broadcast_news_(seminar)&feed=atom&action=historyLanguage identification on broadcast news (seminar) - Revision history2024-03-29T10:54:01ZRevision history for this page on the wikiMediaWiki 1.41.0https://www.hlt.inesc-id.pt/wiki/index.php?title=Language_identification_on_broadcast_news_(seminar)&diff=3631&oldid=prevLcoheur: /* Date */2007-01-16T15:54:37Z<p><span dir="auto"><span class="autocomment">Date</span></span></p>
<table style="background-color: #fff; color: #202122;" data-mw="interface">
<col class="diff-marker" />
<col class="diff-content" />
<col class="diff-marker" />
<col class="diff-content" />
<tr class="diff-title" lang="en">
<td colspan="2" style="background-color: #fff; color: #202122; text-align: center;">← Older revision</td>
<td colspan="2" style="background-color: #fff; color: #202122; text-align: center;">Revision as of 15:54, 16 January 2007</td>
</tr><tr><td colspan="2" class="diff-lineno" id="mw-diff-left-l4">Line 4:</td>
<td colspan="2" class="diff-lineno">Line 4:</td></tr>
<tr><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br></td><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br></td></tr>
<tr><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>* 14:30, January 19, 2007</div></td><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>* 14:30, January 19, 2007</div></td></tr>
<tr><td class="diff-marker" data-marker="−"></td><td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;"><div>* 3rd floor meeting room (presentation in <del style="font-weight: bold; text-decoration: none;">Portuguese</del>)</div></td><td class="diff-marker" data-marker="+"></td><td style="color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;"><div>* 3rd floor meeting room (presentation in <ins style="font-weight: bold; text-decoration: none;">English</ins>)</div></td></tr>
<tr><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br></td><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><br></td></tr>
<tr><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>== Speaker ==</div></td><td class="diff-marker"></td><td style="background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;"><div>== Speaker ==</div></td></tr>
</table>Lcoheurhttps://www.hlt.inesc-id.pt/wiki/index.php?title=Language_identification_on_broadcast_news_(seminar)&diff=3630&oldid=prevLcoheur at 15:53, 16 January 20072007-01-16T15:53:09Z<p></p>
<p><b>New page</b></p><div>__NOTOC__<br />
<br />
== Date ==<br />
<br />
* 14:30, January 19, 2007<br />
* 3rd floor meeting room (presentation in Portuguese)<br />
<br />
== Speaker ==<br />
<br />
* [[Jean-Luc Rouas]]<br />
<br />
== Abstract ==<br />
<br />
At L2F, a substantial body of research has been carried out on Automatic Speech Recognition (ASR) for Portuguese. The ASR system is currently used to transcribe broadcast news from a public national channel: the Telejornal on RTP1. The system now runs daily, and the transcription of the most recent evening news broadcast is available at http://www.l2f.inesc-id.pt/wiki/index.php/Demos.<br />
<br />
However, one problem the ASR system encounters is the presence of other languages: many interviews are subtitled in Portuguese while the audio remains in the original language, which causes recognition errors. To reduce the error rate, the system therefore needs to know whether the spoken language is actually Portuguese. Furthermore, if ASR systems are available for the most frequent other languages (such as English), we will be able not only to identify the spoken language but also to select the appropriate ASR system.<br />
<br />
In this presentation, I will explain the work I have done at L2F on language identification. After a brief review of automatic language identification systems, I will describe the approaches I implemented during my stay at L2F: the first is based on prosodic features, the second uses acoustic properties of languages, and the third is based on phonotactics. Finally, a basic fusion of the three is described to give an idea of how much improvement can be gained by using all of the information together.<br />
<br />
<br />
[[category:Seminars]]<br />
[[category:Seminars 2007]]</div>Lcoheur