COLING
2008

Unsupervised Induction of Labeled Parse Trees by Clustering with Syntactic Features

We present an algorithm for unsupervised induction of labeled parse trees. The algorithm has three stages: bracketing, initial labeling, and label clustering. Bracketing is done from raw text using an unsupervised incremental parser. Initial labeling is done using a merging model that aims at minimizing the grammar description length. Finally, labels are clustered to a desired number of labels using syntactic features extracted from the initially labeled trees. The algorithm obtains 59% labeled f-score on the WSJ10 corpus, as compared to 35% in previous work, and substantial error reduction over a random baseline. We report results for English, German and Chinese corpora, using two label mapping methods and two label set sizes.
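The labeled f-score reported above can be illustrated with a minimal sketch. It is not the authors' evaluation code. It assumes each tree is represented as a list of (label, start, end) brackets and that induced labels have already been mapped to gold labels, so the abstract's label mapping methods are outside its scope. The function name `labeled_f_score` and the bracket representation are hypothetical.

```python
from collections import Counter


def labeled_f_score(gold, predicted):
    """Return (precision, recall, f_score) over labeled brackets.

    gold, predicted: lists of sentences; each sentence is a list of
    (label, start, end) tuples. This representation is an assumption
    made for illustration, not taken from the paper.
    """
    matched = gold_total = pred_total = 0
    for gold_brackets, pred_brackets in zip(gold, predicted):
        gold_counts = Counter(gold_brackets)
        pred_counts = Counter(pred_brackets)
        # A bracket counts as correct only if its span AND label both match.
        matched += sum((gold_counts & pred_counts).values())
        gold_total += sum(gold_counts.values())
        pred_total += sum(pred_counts.values())

    precision = matched / pred_total if pred_total else 0.0
    recall = matched / gold_total if gold_total else 0.0
    if precision + recall == 0:
        return 0.0, 0.0, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Toy example: one sentence, three brackets, one of them mislabeled.
gold = [[("S", 0, 3), ("NP", 0, 1), ("VP", 1, 3)]]
predicted = [[("S", 0, 3), ("NP", 0, 1), ("NP", 1, 3)]]
precision, recall, f_score = labeled_f_score(gold, predicted)
```

In the toy example two of the three predicted brackets match a gold bracket in both span and label, so precision, recall, and f-score all equal 2/3. An unlabeled score would instead count the third bracket as correct, since its span matches.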
Roi Reichart, Ari Rappoport
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2008
Where COLING