This paper investigates the use of stemming for classification of Dutch (email) texts. We introduce a stemmer, which combines dictionary lookup (implemented efficiently as a finit...
This paper describes the first large-scale evaluation of information retrieval systems using Dutch documents and queries. We describe in detail the characteristics of the Dutch te...
We describe a Named Entity Recognition system for Dutch that combines gazetteers, handcrafted rules, and machine learning on the basis of seed material. We used gazetteers and a c...
In this paper we present the Alpino Dependency Treebank and the tools that we have developed to facilitate the annotation process. Annotation typically starts with parsing a sente...
Leonoor van der Beek, Gosse Bouma, Rob Malouf, Ger...
This paper discusses the influence of the corpus on the automatic identification of proper names in texts. Techniques developed for the newswire genre are generally not sufficient...