LREC
2010

WikiWoods: Syntacto-Semantic Annotation for English Wikipedia

WikiWoods is an ongoing initiative to provide rich syntacto-semantic annotations for English Wikipedia. We sketch an automated processing pipeline to extract relevant textual content from Wikipedia sources, segment documents into sentence-like units, parse and disambiguate using a broad-coverage precision grammar, and support the export of syntactic and semantic information in various formats. The full parsed corpus is accompanied by a subset of Wikipedia articles for which gold-standard annotations in the same format were produced manually. This subset was selected to represent a coherent domain, Wikipedia entries on the broad topic of Natural Language Processing.
Dan Flickinger, Stephan Oepen, Gisle Ytrestøl
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2010
Where LREC
Authors Dan Flickinger, Stephan Oepen, Gisle Ytrestøl