Sciweavers

LREC
2008

Semantically Annotated Snapshot of the English Wikipedia

This paper describes SW1, the first version of a semantically annotated snapshot of the English Wikipedia. In recent years Wikipedia has become a valuable resource for both the Natural Language Processing (NLP) community and the Information Retrieval (IR) community. Although NLP technology for processing Wikipedia already exists, not all researchers and developers have the computational resources to process such a volume of information. Moreover, when different versions of Wikipedia are processed in different ways, results become difficult to compare. The aim of this work is to give researchers in both the NLP and IR communities easy access to syntactic and semantic annotations by building a reference corpus that homogenizes experiments and makes results comparable. These resources, a semantically annotated corpus and an "entity containment" derived graph, are licensed under the GNU Free Documentation License and available from http://www.yr-bcn.es/semanticWikipedia.
Jordi Atserias, Hugo Zaragoza, Massimiliano Ciaramita, Giuseppe Attardi
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2008
Where LREC
Authors Jordi Atserias, Hugo Zaragoza, Massimiliano Ciaramita, Giuseppe Attardi