

Tagging the Dutch PAROLE Corpus

We discuss the annotation of the Dutch PAROLE Internet Corpus with part-of-speech tags and lemmas. The PAROLE PoS tagger is a combination of statistical taggers. It includes the Markov tagger TnT and three taggers developed at the INL with the purpose of using information beyond the training data. Lemmas are assigned by a deterministic procedure based on an extensive lexicon. The output is in some respects not entirely satisfactory; we discuss what can be done about this without having to manually correct the complete corpus.
Jesse de Does, John van der Voort van der Kleij
Added 31 Oct 2010
Updated 31 Oct 2010
Type Conference
Year 2001
Where CLIN