EVNE2013

From HLT@INESC-ID

Revision as of 16:55, 9 October 2013 by Ldsm


EVNE2013 - Economic and Violent Named-Event corpus

Version 1.0 2013 Created by Luis Marujo (lmarujo@cs.cmu.edu)

EVNE2013 has 100 news articles covering 10 violent and economic event types: Armed Clashes, Bankruptcy, Change of CEO, Legal Trouble, Mergers, Sex abuse, Street protest, Strike, Suicide Bombing, and Terrorism Bombing. The news articles were collected from 63 sources.

Download

EVNE2013

Data Format

For each of the 100 news articles we provide the following information per line:

URL:

Source:

Labels: (each label matches one sentence)
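As a minimal sketch of how such a record might be read, the following assumes that each article is stored as three consecutive lines of the form "Field: value" and that the labels are separated by whitespace. The exact file layout is not specified here, so both points, as well as the function name `parse_record` and the sample values, are assumptions to adjust against the downloaded file.

```python
def parse_record(lines):
    """Parse three 'Field: value' lines into a dict (assumed layout)."""
    record = {}
    for line in lines:
        # Split at the first colon only, so URLs such as "http://..." stay intact.
        field, _, value = line.partition(":")
        record[field.strip().lower()] = value.strip()
    missing = {"url", "source", "labels"} - record.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    # Assumption: one whitespace-separated label per sentence.
    record["labels"] = record["labels"].split()
    return record


# Hypothetical record, for illustration only (not taken from the corpus).
sample = [
    "URL: http://example.com/news/article.html",
    "Source: Example News",
    "Labels: N AC AC N",
]
record = parse_record(sample)
print(record["url"], record["source"], record["labels"])
```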

We have the following event types and corresponding labels:

Armed Clashes: AC

Bankruptcy: B

Change of CEO: CC

Legal Trouble: L

Mergers: M

Sex abuse: S

Street protest: P

Strike (Work): SW

Suicide Bombing: SB

Terrorism Bombing: T

No event or null event: N
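The label codes above can be kept in a small lookup table when processing the corpus. The sketch below is only illustrative: the names `EVENT_LABELS` and `decode_labels` are our own and are not part of the corpus distribution.

```python
# Label codes and event types exactly as listed above.
EVENT_LABELS = {
    "AC": "Armed Clashes",
    "B": "Bankruptcy",
    "CC": "Change of CEO",
    "L": "Legal Trouble",
    "M": "Mergers",
    "S": "Sex abuse",
    "P": "Street protest",
    "SW": "Strike (Work)",
    "SB": "Suicide Bombing",
    "T": "Terrorism Bombing",
    "N": "No event or null event",
}


def decode_labels(codes):
    """Map label codes to event-type names, failing on unknown codes."""
    unknown = [code for code in codes if code not in EVENT_LABELS]
    if unknown:
        raise ValueError(f"unknown label codes: {unknown}")
    return [EVENT_LABELS[code] for code in codes]


print(decode_labels(["N", "AC", "SB"]))
```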

To obtain the text documents, run the following script from a Unix command line or from Cygwin (Windows): ./downloadFilesScript.sh

We then extracted the text from the HTML with a text extractor, e.g., boilerpipe (https://code.google.com/p/boilerpipe/), and used the Stanford Parser to identify the sentence boundaries.
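Because each label matches one sentence, a simple consistency check after extraction and sentence splitting is to compare the number of sentences with the number of labels. The sketch below assumes the sentences are already available as a list of strings (for example, from the sentence splitter); the function name `align` and the sample data are hypothetical.

```python
def align(sentences, labels):
    """Pair each sentence with its label; fail if the counts differ."""
    if len(sentences) != len(labels):
        raise ValueError(
            f"{len(sentences)} sentences but {len(labels)} labels; "
            "extraction or sentence splitting may differ from the original"
        )
    return list(zip(labels, sentences))


# Hypothetical sentences and labels, for illustration only.
sentences = [
    "The company filed for bankruptcy on Monday.",
    "Its shares had fallen for months.",
]
labels = ["B", "N"]
for label, sentence in align(sentences, labels):
    print(label, sentence)
```

A count mismatch usually means that the downloaded page has changed or that the extractor or sentence splitter behaves differently from the one used to build the corpus.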

Further Reading

The corpus is free for non-commercial use. Please contact Luis Marujo for other uses.

Please note that the text of the news articles is not included due to copyright restrictions.

Contact the author if you find any problems retrieving the documents or aligning the labels with the sentences.