: Difference between revisions

Revision as of 12:21, 17 February 2006

Academia Militar

Integrated Tools and Ontologies

09:30 - Presentation by Joana Paulo.

Information available at http://l2f.l2f.inesc-id.pt/ (intranet)

Integrated tools:
- ATA
- JaVaLi!
- DID
- SAF

Third-party tools:
- Intex

Ontologies:
- OntoWine (wine domain ontology)
- OntoChef (cooking domain ontology)

Lexicons

09:50 - Presentation by Ricardo Daniel Ribeiro.

Information available at http://lrdb.l2f.inesc-id.pt/ (intranet)

PAROLE/SIMPLE: 20k root forms + inflection paradigms (morphology + syntax + semantics)
LUSOlex: 65k root forms (morphology + gramcat)
BRASILex: 68k root forms (morphology + gramcat)
Integration of LUSOlex + EPLexIC: ~8-10x EPLexIC phonetic forms
DicPro: 6.2k anthroponyms
SMorph: 26k root forms (morphology + inflection paradigm)
EPLexIC: 80k word forms (morphology + pronunciation); under construction
ONOMASTICA: 85k proper names (people, streets, cities, companies); 11 languages and cross-lingual information; pronunciation
Broadcast News: 64k entries (pronunciation)

Corpora

10:10 - Presentation by Paula Cristina Vaz.

Information available at http://corpora.l2f.inesc-id.pt/ (intranet)

CETENFolha: 24 Mwords (newspaper corpus)
CETEMPúblico: 180 Mwords (newspaper corpus)
CHInf: 100 children's stories (books)
Newspapers: 10 daily newspapers; ~600 Mwords
PAROLE: ~20 Mwords

Coffee Break and Welcome Reception

10:30 - Welcome reception by General Carlos Carvalho dos Reis.

Spoken Language Corpora

11:00 - Presentation by Rui Amaral.

Information available at http://corpora.l2f.inesc-id.pt/ (intranet)
List of presented corpora:

EUROM.1
BDFALA: newspapers and TV debates
SPEECHDAT
CORAL
ALERT-ASR
ALERT-TD: TV broadcast news
IPSOM: six spoken books (read by professionals); discussion regarding publication and distribution rights
LECTRA: classroom lectures (pilot corpus); two semesters (under construction)
PAPOUS: corpus of children's stories performed by António Rito Silva (falsetto voice)

Simple Text Processing Tools

11:20 - Presentation by Fernando Batista.

Information available at http://l2f.l2f.inesc-id.pt/ (intranet)

Morphological analysis
- SMorph - POS tagger, tokenizer, generator
- Palavroso - POS tagger, tokenizer, generator
- Amorfo/XA - POS tagger, tokenizer, simultaneous multi-lingual analysis; error-correction (spelling correction)

Morphological generation
- Monge - general form generator; language- and tag-independent; uses LRDB (under development; usable)
- Gover - verb generator (~10k manually corrected verbs)

Morpho-syntax processing
- PAsMo - rule-based rewriter
- MARv - morpho-syntactic disambiguation

Syntactic analysis
- SuSAna -
- ParVO - syntactic analyzer (Earley algorithm; variable unification; O(n³))
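
For readers unfamiliar with the Earley algorithm named in the ParVO entry, the following is a minimal recognizer sketch. It is not ParVO's code: variable unification is left out, empty rules are not supported, and the toy grammar, the `Item` tuple and the `recognize` function are hypothetical names chosen for illustration. The list-based chart keeps the example short rather than fast.

```python
from collections import namedtuple

# An Earley item: rule `lhs -> rhs`, a dot position inside rhs, and the
# input position (origin) where recognition of this rule started.
Item = namedtuple("Item", "lhs rhs dot origin")


def recognize(grammar, start, tokens):
    """Return True if `tokens` can be derived from `start`.

    `grammar` maps each nonterminal to a list of right-hand sides (tuples).
    Any symbol that is not a key of `grammar` is treated as a terminal.
    Empty right-hand sides are not supported in this sketch.
    """
    n = len(tokens)
    chart = [[] for _ in range(n + 1)]

    def add(pos, item):
        if item not in chart[pos]:
            chart[pos].append(item)

    for rhs in grammar[start]:
        add(0, Item(start, rhs, 0, 0))

    for pos in range(n + 1):
        i = 0
        while i < len(chart[pos]):  # chart[pos] may grow while it is scanned
            item = chart[pos][i]
            i += 1
            if item.dot < len(item.rhs):
                symbol = item.rhs[item.dot]
                if symbol in grammar:
                    # Predictor: expand the nonterminal after the dot.
                    for rhs in grammar[symbol]:
                        add(pos, Item(symbol, rhs, 0, pos))
                elif pos < n and tokens[pos] == symbol:
                    # Scanner: the terminal after the dot matches the input.
                    add(pos + 1, item._replace(dot=item.dot + 1))
            else:
                # Completer: advance every item that was waiting for item.lhs.
                for parent in chart[item.origin]:
                    if (parent.dot < len(parent.rhs)
                            and parent.rhs[parent.dot] == item.lhs):
                        add(pos, parent._replace(dot=parent.dot + 1))

    return any(it.lhs == start and it.dot == len(it.rhs) and it.origin == 0
               for it in chart[n])


# Hypothetical toy grammar, for illustration only.
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("Det", "N"), ("N",)],
    "VP": [("V", "NP"), ("V",)],
    "Det": [("o",), ("a",)],
    "N": [("gato",), ("sopa",)],
    "V": [("come",), ("dorme",)],
}

print(recognize(GRAMMAR, "S", "o gato come a sopa".split()))
print(recognize(GRAMMAR, "S", "gato o come".split()))
```

The first call accepts the sentence and the second rejects it. A parser with unification would additionally attach feature structures to each item and unify them in the completer step.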

Syntax-Semantics interface
- Algas - arrowing construction
- AsDeCopas -

Other tools
- text2syl - syllabification
- num2ext - text normalizer
- YAH - (yet another) hyphenator (rule-based); MS Office compatible
- Correcto - spell checker; MS Office compatible
- leia - grapheme-2-phone converter (normalizer)

General purpose
- FSTK lib - finite-state transducer toolkit

Speech Synthesis Tools

11:40 - Presentation by Sérgio Paulo.

Information available at http://l2f.l2f.inesc-id.pt/ (intranet)

EmoVoice - transformation of speech-based emotions
L2F_MuLA - multi-level speech aligner and annotator
L2F_PhoneAlign - phonetic aligner
dixi-tok2wrd - normalizer

Speech Recognition Tools

12:00 - Presentation by Hugo Meinedo.

Information available at http://l2f.l2f.inesc-id.pt/ (intranet)

AUDIMUS - ASR with well-documented API; AUDIMUS.linux (frozen; discontinued; no longer supported; usable); AUDIMUS.cvs (usable; development version); MS Office integration; multi-platform

Brainstorming Presentation

12:20 - Presentation by Luís Caldas de Oliveira

Ideas for new projects: 4~5 ideas to be detailed/discussed in the afternoon session

@@ Line 1: / Line 1: @@
-[[Image:logo-academia-militar.gif|346px|right|thumb|[http://www.academiamilitar.pt/ Academia Militar]]]
+{| align='right'
+! style='border-style: solid; border-width: 1px;' | [[Image:logo-academia-militar.gif|346px|right]]<br/>
+[http://www.academiamilitar.pt/ Academia Militar]
+|}
 == Integrated Tools and Ontologies ==
@@ Line 123: / Line 127: @@
 * AUDIMUS - ASR with well-documented API; AUDIMUS.linux (frozen; discontinued; no longer supported; usable); AUDIMUS.cvs (usable; development version); MS Office integration; multi-platform
+== Brainstorming Presentation ==
+* 12:20 - Presentation by [[Luís Caldas de Oliveira]]
+* Ideas for new projects: 4~5 ideas to be detailed/discussed in the afternoon session
+*

From HLT@INESC-ID
