: Difference between revisions
From HLT@INESC-ID
Revision as of 11:27, 17 February 2006
Integrated Tools and Ontologies
- 09:30 - Presentation by Joana Paulo.
Information available at http://l2f.l2f.inesc-id.pt/ (intranet)
- Integrated tools:
  - ATA
  - JaVaLi!
  - DID
  - SAF
- 3rd Party:
  - Intex
- Ontologies:
  - OntoWine (wine domain ontology)
  - OntoChef (cooking domain ontology)
Lexicons
- 09:50 - Presentation by Ricardo Daniel Ribeiro.
Information available at http://lrdb.l2f.inesc-id.pt/ (intranet)
- PAROLE/SIMPLE: 20k root forms + inflection paradigms (morphology + syntax + semantics)
- LUSOlex: 65k root forms (morphology + gramcat)
- BRASILex: 68k root forms (morphology + gramcat)
- Integration of LUSOlex + EPLexIC: ~8-10x EPLexIC phonetic forms
- DicPro: 6.2k anthroponyms
- SMorph: 26k root forms (morphology + inflection paradigm)
- EPLexIC: 80k word forms (morphology + pronunciation); under construction
- ONOMASTICA: 85k proper names (people, streets, cities, companies); 11 languages and cross-lingual information; pronunciation
- Broadcast News: 64k entries (pronunciation)
Corpora
- 10:10 - Presentation by Paula Cristina Vaz.
Information available at http://corpora.l2f.inesc-id.pt/ (intranet)
- CETENFolha: 24 Mwords (newspaper corpus)
- CETEMPúblico: 180 Mwords (newspaper corpus)
- CHInf: 100 children's stories (books)
- Newspapers: 10 daily newspapers; ~600 Mwords
- PAROLE: ~20 Mwords
Coffee Break and Welcome Reception
- 10:30 - Welcome reception by General Carlos Carvalho dos Reis.
Spoken Language Corpora
- 11:00 - Presentation by Rui Amaral.
Information available at http://corpora.l2f.inesc-id.pt/ (intranet)
List of presented corpora:
- EUROM.1
- BDFALA: newspapers and TV debates
- SPEECHDAT
- CORAL
- ALERT-ASR
- ALERT-TD: TV broadcast news
- IPSOM: six spoken books (read by professionals); discussion regarding publication and distribution rights
- LECTRA: classroom lectures (pilot corpus); two semesters (under construction)
- PAPOUS: corpus of children's stories performed by António Rito Silva (falsetto voice)