Learning Accurate, Compact, and Interpretable Tree Annotation


We present an automatic approach to tree annotation in which basic nonterminal symbols are alternately split and merged to maximize the likelihood of a training treebank. Starting with a simple X-bar grammar, we learn a new grammar whose nonterminals are subsymbols of the original nonterminals. In contrast with previous work, we are able to split various terminals to different degrees, as appropriate to the actual complexity in the data. Our grammars automatically learn the kinds of linguistic distinctions exhibited in previous work on manual tree annotation. On the other hand, our grammars are much more compact and substantially more accurate than previous work on automatic annotation. Despite its simplicity, our best grammar achieves an F1 of 90.2% on the Penn Treebank, higher than fully lexicalized systems.
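As a rough illustration of the "split" half of the split-and-merge idea described in the abstract, the following sketch divides every nonterminal of a toy probabilistic grammar into two subsymbols. It is not the authors' implementation. The function name `split_symbols`, the rule representation, the `noise` parameter, and the toy grammar are all assumptions made for this example. The sketch also splits the start symbol and omits both the likelihood-based retraining and the merge step that the abstract mentions.

```python
import random
from collections import defaultdict


def split_symbols(rules, seed=0, noise=0.01):
    """Split every nonterminal X into two subsymbols X_0 and X_1.

    rules maps (parent, children_tuple) -> probability. A symbol counts as
    a nonterminal if it appears as the parent of some rule (an assumption
    of this sketch); everything else is treated as a terminal.
    """
    rng = random.Random(seed)
    nonterminals = {parent for parent, _ in rules}

    def variants(symbol):
        # Nonterminals get two subsymbols; terminals are left unchanged.
        if symbol in nonterminals:
            return [f"{symbol}_0", f"{symbol}_1"]
        return [symbol]

    new_rules = {}
    for (parent, children), prob in rules.items():
        # Enumerate every combination of child subsymbols.
        combos = [()]
        for child in children:
            combos = [c + (v,) for c in combos for v in variants(child)]
        for sub_parent in variants(parent):
            for combo in combos:
                # Share the probability evenly, with a little random jitter
                # so that the two subsymbols are not exactly identical.
                jitter = 1.0 + rng.uniform(-noise, noise)
                new_rules[(sub_parent, combo)] = prob / len(combos) * jitter

    # Renormalize so each subsymbol's rule probabilities sum to one.
    totals = defaultdict(float)
    for (parent, _), prob in new_rules.items():
        totals[parent] += prob
    return {key: prob / totals[key[0]] for key, prob in new_rules.items()}


# Hypothetical toy grammar, used only to exercise the function.
toy_grammar = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("dogs",)): 1.0,
    ("VP", ("bark",)): 1.0,
}
split_grammar = split_symbols(toy_grammar)
```

In a full split-and-merge procedure, a step like this would be followed by re-estimating the rule probabilities on the treebank and then merging back the splits that contribute least to the likelihood; those parts are left out here.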

Slav Petrov, Leon Barrett, Romain Thibaux, Dan Klein

ACL 2006 | ACL 2007 | Grammar | Simple Xbar Grammar | Tree Annotation |

Post Info

Added	30 Oct 2010
Updated	30 Oct 2010
Type	Conference
Year	2006
Where	ACL
Authors	Slav Petrov, Leon Barrett, Romain Thibaux, Dan Klein
