Sciweavers

EMNLP
2007

Bootstrapping Feature-Rich Dependency Parsers with Entropic Priors

One may need to build a statistical parser for a new language, using only a very small labeled treebank together with raw text. We argue that bootstrapping a parser is most promising when the model uses a rich set of redundant features, as in recent models for scoring dependency parses (McDonald et al., 2005). Drawing on Abney’s (2004) analysis of the Yarowsky algorithm, we perform bootstrapping by entropy regularization: we maximize a linear combination of conditional likelihood on labeled data and confidence (negative Rényi entropy) on unlabeled data. In initial experiments, this surpassed EM for training a simple feature-poor generative model, and also improved the performance of a feature-rich, conditionally estimated model where EM could not easily have been applied. For our models and training sets, more peaked measures of confidence, measured by Rényi entropy, outperformed smoother ones. We discuss how our feature set could be extended with cross-lingual or cross-domain...
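The entropy-regularized objective described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each sentence comes with a small explicit list of candidate parse scores, whereas a real dependency parser would have to sum over all trees. The names `renyi_entropy`, `softmax`, `bootstrapping_objective`, and the weight `gamma` are hypothetical and chosen only for illustration.

```python
import math


def renyi_entropy(probs, alpha):
    """Renyi entropy (in nats) of a discrete distribution.

    alpha == 1 gives Shannon entropy (the limiting case);
    alpha == inf gives min-entropy, -log(max p).
    """
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    support = [p for p in probs if p > 0]
    if alpha == 1:
        return -sum(p * math.log(p) for p in support)
    if math.isinf(alpha):
        return -math.log(max(support))
    return math.log(sum(p ** alpha for p in support)) / (1.0 - alpha)


def softmax(scores):
    """Turn candidate-parse scores into a conditional distribution."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def bootstrapping_objective(labeled, unlabeled, gamma, alpha):
    """Conditional log-likelihood plus gamma times confidence.

    labeled:   list of (scores, gold_index) pairs   (assumed toy format)
    unlabeled: list of score lists                  (assumed toy format)
    gamma:     hypothetical weight on the confidence term
    """
    log_likelihood = sum(
        math.log(softmax(scores)[gold]) for scores, gold in labeled
    )
    # Confidence is the negative Renyi entropy of the model's own
    # distribution over candidate parses of each unlabeled sentence.
    confidence = -sum(
        renyi_entropy(softmax(scores), alpha) for scores in unlabeled
    )
    return log_likelihood + gamma * confidence
```

In this toy setting, a model that concentrates probability on one parse of each unlabeled sentence scores higher than one that spreads probability evenly, and `alpha` selects which member of the Rényi family measures that confidence.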
David A. Smith, Jason Eisner
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2007
Where EMNLP
Authors David A. Smith, Jason Eisner