NLPRS
2001
Springer

Statistical Parsing of Dutch using Maximum Entropy Models with Feature Merging

In this project report we describe work in statistical parsing using the maximum entropy technique and the Alpino language analysis system for Dutch. A major difficulty in this domain is the lack of sufficient corpus data available for training. Among other problems, this sparseness of data increases the danger of the model overfitting the training data, making it particularly important that the selection of statistical features upon which to base the model be optimal. To this end we have adapted the notion of feature merging, a means of constructing equivalence classes of statistical features based upon common elements within them. In spite of promising preliminary results, subsequent tests have not enabled us to conclude whether this approach helps the kind of models we are working with.
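The abstract describes feature merging only in general terms, as grouping statistical features into equivalence classes based on elements they share. As a rough illustration of that idea, and not the authors' actual procedure, the sketch below represents each feature as a tuple of elements and merges features that agree on everything except their last element. The feature names, the counts, and the merging criterion `drop_last_element` are all hypothetical.

```python
from collections import Counter


def merge_features(counts, key_fn):
    """Sum the counts of all features that key_fn maps to the same class."""
    merged = Counter()
    for feature, count in counts.items():
        merged[key_fn(feature)] += count
    return merged


def drop_last_element(feature):
    # Hypothetical merging criterion: features that share every element
    # except the last one fall into the same equivalence class.
    return feature[:-1]


# Toy feature counts; the feature names are invented for illustration
# and are not taken from the Alpino system.
counts = {
    ("np", "det", "de"): 3,
    ("np", "det", "het"): 2,
    ("np", "adj", "groot"): 1,
    ("vp", "obj", "boek"): 1,
}

merged = merge_features(counts, drop_last_element)
for feature_class, total in sorted(merged.items()):
    print(feature_class, total)
```

Merging in this way reduces the number of distinct features and pools their counts, which is one plausible way to ease the data sparseness the abstract mentions. Whether it helps a given model is an empirical question, as the authors themselves note.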
Tony Mullen, Rob Malouf, Gertjan van Noord
Added 30 Jul 2010
Updated 30 Jul 2010
Type Conference
Year 2001
Where NLPRS
Authors Tony Mullen, Rob Malouf, Gertjan van Noord