Bootstrapping statistical parsers from small datasets


We present a practical co-training method for bootstrapping statistical parsers using a small amount of manually parsed training material and a much larger pool of raw sentences. Experimental results show that unlabelled sentences can be used to improve the performance of statistical parsers. In addition, we consider the problem of bootstrapping parsers when the manually parsed training material is in a different domain to either the raw sentences or the testing material. We show that bootstrapping continues to be useful, even though no manually produced parses from the target domain are used.
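As a rough illustration of the co-training idea described in the abstract, the sketch below shows a generic loop in which two models label raw sentences for each other. It is not the authors' implementation. `ToyTagger`, `co_train`, the `rounds` and `per_round` parameters, and the confidence measure (the share of known words) are all hypothetical stand-ins, and a simple word tagger takes the place of a real statistical parser.

```python
from collections import Counter, defaultdict


class ToyTagger:
    """Hypothetical stand-in for a statistical parser: tags each word
    with the tag it was most often seen with during training."""

    def __init__(self, default_tag):
        self.default_tag = default_tag  # assumed back-off for unseen words
        self.counts = defaultdict(Counter)

    def train(self, labelled):
        # Retrain from scratch on (words, tags) pairs.
        self.counts = defaultdict(Counter)
        for words, tags in labelled:
            for word, tag in zip(words, tags):
                self.counts[word][tag] += 1

    def parse(self, words):
        # Return (tags, confidence); confidence is the share of known words.
        tags = [
            self.counts[w].most_common(1)[0][0] if w in self.counts
            else self.default_tag
            for w in words
        ]
        known = sum(1 for w in words if w in self.counts)
        return tags, known / len(words)


def co_train(model_a, model_b, seed, raw, rounds=3, per_round=2):
    """Generic co-training loop: each model labels the raw sentences it is
    most confident about, and those go into the other model's training set."""
    train_a, train_b = list(seed), list(seed)
    pool = list(raw)
    for _ in range(rounds):
        if not pool:
            break
        model_a.train(train_a)
        model_b.train(train_b)
        for teacher, student_data in ((model_a, train_b), (model_b, train_a)):
            # Rank the remaining raw sentences by the teacher's confidence.
            scored = sorted(
                ((teacher.parse(sent), sent) for sent in pool),
                key=lambda item: item[0][1],
                reverse=True,
            )
            for (tags, _confidence), sent in scored[:per_round]:
                student_data.append((sent, tags))
                pool.remove(sent)
    # Final retraining so both models reflect everything they were given.
    model_a.train(train_a)
    model_b.train(train_b)
    return train_a, train_b
```

In this sketch the seed list plays the role of the small manually parsed set and `raw` plays the role of the larger pool of unlabelled sentences. A real system would replace the toy confidence score with a parser-specific selection criterion.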

Mark Steedman, Anoop Sarkar, Miles Osborne, Rebecca Hwa, Stephen Clark, Julia Hockenmaier, Paul Ruhlen, Steven Baker, Jeremiah Crim


EACL 2003 | Natural Language Processing | Parsed Training Material | Raw Sentences | Statistical Parsers |



Post Info

Added	31 Oct 2010
Updated	31 Oct 2010
Type	Conference
Year	2003
Where	EACL
Authors	Mark Steedman, Anoop Sarkar, Miles Osborne, Rebecca Hwa, Stephen Clark, Julia Hockenmaier, Paul Ruhlen, Steven Baker, Jeremiah Crim
