Sciweavers

CLIN
2003
Methods for the Extraction of Hungarian Multi-Word Lexemes
This paper describes an experiment on extracting Hungarian multi-word lexemes from a corpus, using statistical methods. Corpus preparation—the addition of POS tags and stems—w...
Balázs Kis, Begoña Villada, Gosse Bo...
CLIN
2003
Interrupting Constructions in a Rejuvenated Amazon Grammar
This paper reports on the latest rejuvenation of AMAZON, a structuralist parser for Dutch written sentences. Unlike older versions, the new AMAZON parser has been developed in a m...
Carla Schelfhout, Peter-Arno Coppen
CLIN
2003
On the Statistical Consistency of DOP Estimators
A statistical estimator attempts to guess an unknown probability distribution by analyzing a sample from this distribution. One desirable property of an estimator is that its gues...
Detlef Prescher, Remko Scha, Khalil Sima'an, Andre...
CLIN
2003
Detection of Plagiarism in Student Essays
This paper presents two methods for automatic detection of plagiarism in student essays, using Dutch text corpora to show their effectiveness. The first method is based on measur...
Hans van Halteren
CLIN
2003
A Memory-Based Shallow Parser for Spoken Dutch
We describe the development of a Dutch memory-based shallow parser. The availability of large treebanks for Dutch, such as the one provided by the Spoken Dutch Corpus, allows memo...
Sander Canisius, Antal van den Bosch
CLIN
2003
A Corpus Investigation of PP-fronting in Dutch
A long-standing discussion in Dutch syntax concerns the question whether PP dependents of a noun may be fronted. Although examples which apparently illustrate this pattern can be ...
Gosse Bouma
CLIN
2003
Reduction of Dutch Sentences for Automatic Subtitling
We compare machine learning approaches for sentence length reduction for automatic generation of subtitles for deaf and hearing-impaired people with a method which relies on hand-...
Erik F. Tjong Kim Sang, Walter Daelemans, Anja H...
CLIN
2003
Incremental Construction of Minimal Sequential Transducers
This paper presents an efficient algorithm for the incremental construction of a minimal acyclic sequential transducer (ST) from a list of input and output strings. The algorithm...
Wojciech Skut
CLIN
2003
Natural Language Processing in Information Retrieval
Many Natural Language Processing (NLP) techniques have been used in Information Retrieval. The results are not encouraging. Simple methods (stopwording, Porter-style stemming, etc...
Thorsten Brants