110-PT-BN-KP
From HLT@INESC-ID
Download
110-PT-BN-KP: http://www.l2f.inesc-id.pt/~ldsm/110-PT-BN-KP.zip
Latest revision as of 18:08, 10 September 2013
The 110-PT-BN-KP corpus consists of 110 Portuguese Broadcast News (BN) stories annotated with key phrases by an expert.
The stories were extracted from 8 BN programs in the European Portuguese ALERT corpus.
Statistics
Train / Test
- Number of stories: 100 / 10
- Number of words: 29,225 / 3,896
- Average Number of Key Phrases: 24 / 29
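As a rough illustration, the figures above can be used to derive the average story length and the share of stories held out for testing. The snippet below is a minimal sketch: the names `STATS`, `words_per_story`, and `held_out_share` are hypothetical and not part of the corpus distribution, and the derived values are simple arithmetic on the listed counts rather than numbers reported on this page.

```python
# Counts copied from the Statistics list above (train / test).
STATS = {
    "train": {"stories": 100, "words": 29_225},
    "test": {"stories": 10, "words": 3_896},
}


def words_per_story(split: str) -> float:
    """Average words per story for one split (derived, not reported on the page)."""
    counts = STATS[split]
    return counts["words"] / counts["stories"]


def held_out_share() -> float:
    """Fraction of all stories that belong to the test split."""
    total_stories = sum(counts["stories"] for counts in STATS.values())
    return STATS["test"]["stories"] / total_stories


if __name__ == "__main__":
    for split in STATS:
        print(f"{split}: {words_per_story(split):.2f} words per story")
    print(f"test share of stories: {held_out_share():.1%}")
```

Under these counts, the train stories average about 292 words and the test stories about 390 words, with roughly 9% of the 110 stories held out for testing.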
Further Reading
The corpus is free for non-commercial use.
Please contact Luis Marujo for other uses.
If you use the data in a publication, please cite the following paper:
Luis Marujo, Márcio Viveiros, João Paulo da Silva Neto, Keyphrase Cloud Generation of Broadcast News, In Proceedings of Interspeech 2011: 12th Annual Conference of the International Speech Communication Association, ISCA, Florence, Italy, August 2011. PDF: http://www.inesc-id.pt/pt/indicadores/Ficheiros/7588.pdf BibTeX: http://www.inesc-id.pt/intranet/publicacoes/bibtex.php?bibtex=7588