LREC
2008

A Corpus for Cross-Document Co-reference

This paper describes a newly created text corpus of news articles that has been annotated for cross-document co-reference. Being able to robustly resolve references to entities across document boundaries will provide a useful capability for a variety of tasks, ranging from practical information retrieval applications to challenging research in information extraction and natural language understanding. This annotated corpus is intended to encourage the development of systems that can more accurately address this problem. A manual annotation tool was developed that allowed the complete corpus to be searched for likely co-referring entity mentions. This corpus of 257K words links mentions of co-referent people, locations and organizations (subject to some additional constraints). Each of the documents had already been annotated for within-document coreference by the LDC as part of the ACE series of evaluations. The annotation process was bootstrapped with a string-matching-based linking ...
David Day, Janet Hitzeman, Michael L. Wick, Keith Crouch, Massimo Poesio
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2008
Where LREC
Authors David Day, Janet Hitzeman, Michael L. Wick, Keith Crouch, Massimo Poesio