LREC
2008

Tools for Collocation Extraction: Preferences for Active vs. Passive

We present and partially evaluate procedures for extracting noun+verb collocation candidates from German text corpora, together with their morphosyntactic preferences, in particular for active vs. passive voice. Starting from tokenized, tagged, lemmatized and chunked text, we apply extraction patterns formulated in the CQP corpus query language. We discuss the results of a precision evaluation on administrative texts from the European Union: we find a considerable number of specialized collocations, as well as general ones and complex predicates. Overall, precision is considerably higher than that of a statistical extractor used as a baseline.
Ulrich Heid, Marion Weller
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2008
Where LREC
Authors Ulrich Heid, Marion Weller
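
The abstract mentions two quantities that are easy to illustrate: a voice preference per noun+verb pair and the precision of a candidate list. The following is only a minimal sketch under stated assumptions, not the authors' procedure: it does not use CQP, the observations are invented, and the names `voice_preferences` and `precision` are hypothetical helpers introduced here for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical, hand-made observations: (noun lemma, verb lemma, voice).
# In the setting the abstract describes, such triples would come from
# pattern matches over tagged, lemmatized and chunked text; these are invented.
observations = [
    ("Entscheidung", "treffen", "passive"),
    ("Entscheidung", "treffen", "passive"),
    ("Entscheidung", "treffen", "active"),
    ("Antrag", "stellen", "active"),
    ("Antrag", "stellen", "active"),
    ("Antrag", "stellen", "passive"),
    ("Antrag", "stellen", "active"),
]


def voice_preferences(obs):
    """Count active/passive occurrences per (noun, verb) pair and
    return the total and the share of passive occurrences."""
    counts = defaultdict(Counter)
    for noun, verb, voice in obs:
        if voice not in ("active", "passive"):
            raise ValueError(f"unexpected voice label: {voice!r}")
        counts[(noun, verb)][voice] += 1
    prefs = {}
    for pair, c in counts.items():
        total = c["active"] + c["passive"]
        prefs[pair] = {"total": total, "passive_share": c["passive"] / total}
    return prefs


def precision(candidates, accepted):
    """Fraction of extracted candidates that a human judge accepted."""
    if not candidates:
        return 0.0
    hits = sum(1 for cand in candidates if cand in accepted)
    return hits / len(candidates)


prefs = voice_preferences(observations)
for (noun, verb), info in sorted(prefs.items()):
    print(noun, verb, info["total"], round(info["passive_share"], 2))
```

A pair whose passive share is far from the corpus-wide average would be a candidate for a voice preference; how large a difference counts as a preference is a design choice this sketch leaves open.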