(→Lexicons) |
|||
(21 intermediate revisions by one other user not shown) | |||
Line 1: | Line 1: | ||
− | == | + | {| align='right' style='border-style: solid; border-width: 1px; background: #f7f8ff;' |
+ | ! style='text-align: center; border-style: solid; border-width: 0px; border-bottom-width: 1px;' | [[Image:logo-academia-militar.gif]] | ||
+ | |- | ||
+ | ! style='text-align: center;' | [http://www.academiamilitar.pt/ Academia Militar] | ||
+ | |} | ||
− | + | L²F Day 2006 took place at the [http://www.academiamilitar.pt Military Academy] in Lisbon. | |
− | + | == Integrated Tools and Ontologies == | |
− | + | ||
− | + | ||
− | + | ||
− | + | ||
− | * | + | * '''09:30''' - Presentation by [[Joana Paulo]]. |
− | + | ||
− | * | + | Information available at http://l2f.l2f.inesc-id.pt/ (intranet) |
− | * | + | |
− | + | * Integrated tools: '''ATA'''; '''JaVaLi!'''; '''DID'''; '''SAF'''; '''Intex''' (3<sup>rd</sup> party) | |
+ | * Ontologies: '''OntoWine''' (wine domain ontology); '''OntoChef''' (cooking domain ontology) | ||
== Lexicons == | == Lexicons == | ||
− | Presentation by [[Ricardo Daniel Ribeiro]]. | + | * '''09:50''' - Presentation by [[Ricardo Daniel Ribeiro]]. |
− | * PAROLE/SIMPLE: 20k root forms + inflection paradigms (morphology + syntax + semantics) | + | Information available at http://lrdb.l2f.inesc-id.pt/ (intranet) |
− | * LUSOlex: 65k root forms (morphology + gramcat) | + | |
− | * BRASILex: 68k root forms (morphology + gramcat) | + | * '''PAROLE'''/'''SIMPLE''': 20k root forms + inflection paradigms (morphology + syntax + semantics) |
+ | * '''LUSOlex''': 65k root forms (morphology + gramcat) | ||
+ | * '''BRASILex''': 68k root forms (morphology + gramcat) | ||
* Integration of LUSOlex + EPLexIC: ~8-10x EPLexIC phonetic forms | * Integration of LUSOlex + EPLexIC: ~8-10x EPLexIC phonetic forms | ||
− | * DicPro: 6.2k anthroponyms | + | * '''DicPro''': 6.2k anthroponyms |
− | * SMorph: 26k root forms (morphology + inflection paradigm) | + | * '''SMorph''': 26k root forms (morphology + inflection paradigm) |
− | * EPLexIC: 80k word forms (morphology + pronunciation) | + | * '''EPLexIC''': 80k word forms (morphology + pronunciation); under construction |
− | * ONOMASTICA: 85k proper names (people, streets, cities, companies); 11 languages and cross-lingual information; pronunciation | + | * '''ONOMASTICA''': 85k proper names (people, streets, cities, companies); 11 languages and cross-lingual information; pronunciation |
− | * Broadcast News: 64k entries (pronunciation) | + | * '''Broadcast News''': 64k entries (pronunciation) |
+ | |||
+ | == Corpora == | ||
+ | |||
+ | * '''10:10''' - Presentation by [[Paula Cristina Vaz]]. | ||
+ | |||
+ | Information available at http://corpora.l2f.inesc-id.pt/ (intranet) | ||
+ | |||
+ | * '''CETENFolha''': 24 Mwords (newspaper corpus) | ||
+ | * '''CETEMPúblico''': 180 Mwords (newspaper corpus) | ||
+ | * '''CHInf''': 100 children's stories (books) | ||
+ | * '''PAROLE''': ~20 Mwords | ||
+ | * Newspapers: 10 daily newspapers; ~600 Mwords | ||
+ | |||
+ | == Coffee Break and Welcome Reception == | ||
+ | |||
+ | * '''10:30''' - Welcome reception by General Carlos Carvalho dos Reis. | ||
+ | |||
+ | == Spoken Language Corpora == | ||
+ | |||
+ | * '''11:00''' - Presentation by [[Rui Amaral]]. | ||
+ | |||
+ | Information available at http://corpora.l2f.inesc-id.pt/ (intranet)<br/> | ||
+ | List of presented corpora: | ||
+ | |||
+ | * '''EUROM.1''' | ||
+ | * '''BDFALA''': newspapers and TV debates | ||
+ | * '''SPEECHDAT''' | ||
+ | * '''CORAL''' | ||
+ | * '''ALERT-ASR''' | ||
+ | * '''ALERT-TD''': TV broadcast news | ||
+ | * '''IPSOM''': six spoken books (read by professionals); discussion regarding publication and distribution rights | ||
+ | * '''LECTRA''': classroom lectures (pilot corpus); two semesters (under construction) | ||
+ | * '''PAPOUS''': corpus of children's stories performed by António Rito Silva (falsetto voice) | ||
+ | |||
+ | == Simple Text Processing Tools == | ||
+ | |||
+ | * '''11:20''' - Presentation by [[Fernando Batista]]. | ||
+ | |||
+ | Information available at http://l2f.l2f.inesc-id.pt/ (intranet) | ||
+ | |||
+ | * Morphological analysis | ||
+ | ** '''SMorph''' - POS tagger, tokenizer, generator | ||
+ | ** '''Palavroso''' - POS tagger, tokenizer, generator | ||
+ | ** '''Amorfo'''/'''XA''' - POS tagger, tokenizer, simultaneous multi-lingual analysis; error-correction (spelling correction) | ||
+ | |||
+ | * Morphological generation | ||
+ | ** '''Monge''' - general form generator; language- and tag-independent; uses LRDB (under development; usable) | ||
+ | ** '''Gover''' - verb generator (~10k manually corrected verbs) | ||
+ | |||
+ | * Morpho-syntax processing | ||
+ | ** '''PAsMo''' - rule-based rewriter | ||
+ | ** '''MARv''' - morpho-syntactic disambiguation | ||
+ | |||
+ | * Syntactic analysis | ||
+ | ** '''SuSAna''' - Surface syntax analyzer | ||
+ | ** '''ParVO''' - syntactic analyzer (Earley algorithm; variable unification; O(n³)) | ||
+ | |||
+ | * Syntax-Semantics interface | ||
+ | ** '''Algas''' - arrowing construction | ||
+ | ** '''AsDeCopas''' - | ||
+ | |||
+ | * Other tools | ||
+ | ** '''text2syl''' - syllabification | ||
+ | ** '''num2ext''' - text normalizer | ||
+ | ** '''YAH''' - ''(yet another) hyphenator'' (rule-based); MS Office compatible | ||
+ | ** '''Correcto''' - spell checker; MS Office compatible | ||
+ | ** '''leia''' - grapheme-2-phone converter (normalizer) | ||
+ | |||
+ | * General purpose | ||
+ | ** '''FSTK lib''' - finite-state transducer toolkit | ||
+ | |||
+ | == Speech Synthesis Tools == | ||
+ | |||
+ | * '''11:40''' - Presentation by [[Sérgio Paulo]]. | ||
+ | |||
+ | Information available at http://l2f.l2f.inesc-id.pt/ (intranet) | ||
+ | |||
+ | * '''EmoVoice''' - transformation of speech-based emotions | ||
+ | * '''L2F_MuLA''' - multi-level speech aligner and annotator | ||
+ | * '''L2F_PhoneAlign''' - phonetic aligner | ||
+ | * '''dixi-tok2wrd''' - normalizer | ||
+ | |||
+ | == Speech Recognition Tools == | ||
+ | |||
+ | * '''12:00''' - Presentation by [[Hugo Meinedo]]. | ||
+ | |||
+ | Information available at http://l2f.l2f.inesc-id.pt/ (intranet) | ||
+ | |||
+ | * '''AUDIMUS''' - ASR with well-documented API; '''AUDIMUS.linux''' (frozen; discontinued; no longer supported; usable); '''AUDIMUS.cvs''' (usable; development version); MS Office integration; multi-platform | ||
+ | |||
+ | == Brainstorming Presentation == | ||
+ | |||
+ | * '''12:20''' - Presentation by [[Luís Caldas de Oliveira]] | ||
+ | * Ideas for new projects: 4~5 surviving ideas to be detailed/discussed in the afternoon session | ||
+ | |||
+ | == Lunch Break == | ||
+ | |||
+ | * '''12:30''' - Lunch at Paço da Rainha | ||
+ | * '''14:30''' - Visit to the Academia Buildings: museum, library, council chamber, chapel | ||
+ | |||
+ | == Brainstorming == | ||
+ | |||
+ | * '''15:00''' - Moderated by [[Luís Caldas de Oliveira]] | ||
+ | |||
+ | == Coffee Break == | ||
+ | |||
+ | * '''16:45''' | ||
+ | |||
+ | == Analysis == | ||
+ | |||
+ | * '''17:00''' - Moderated by [[Luís Caldas de Oliveira]] | ||
+ | |||
+ | == Final Remarks == | ||
+ | |||
+ | * '''18:15''' - Opportunities for new people: scholarships; post-graduate studies. | ||
+ | |||
+ | [[category:Seminars]] | ||
+ | [[category:Seminars 2006]] | ||
+ | [[category:L²F Day]] |