Adapting a WSJ-Trained Parser to Grammatically Noisy Text

We present a robust parser that is trained on a treebank of ungrammatical sentences. The treebank is created automatically by modifying Penn Treebank sentences so that they contain one or more syntactic errors. We first evaluate an existing Penn-Treebank-trained parser on the ungrammatical treebank to see how it reacts to noise in the form of grammatical errors. We then re-train this parser on the training section of the ungrammatical treebank, which leads to significantly improved performance on the ungrammatical test sets. Finally, we show how a classifier can be used to prevent performance degradation on the original grammatical data.
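
The two ideas in the abstract, automatic error insertion and classifier-based routing between parsers, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which error types are introduced or how the classifier works, so the two error operations (`delete_token`, `repeat_token`) and the callables `looks_grammatical`, `grammatical_parser`, and `noisy_parser` are hypothetical stand-ins. The sketch also works on plain token lists, whereas a real ungrammatical treebank would need the trees modified as well.

```python
import random
from typing import Callable, List

Tokens = List[str]


def delete_token(tokens: Tokens, rng: random.Random) -> Tokens:
    """Return a copy with one randomly chosen token removed (assumed error type)."""
    if len(tokens) < 2:
        return list(tokens)
    i = rng.randrange(len(tokens))
    return tokens[:i] + tokens[i + 1:]


def repeat_token(tokens: Tokens, rng: random.Random) -> Tokens:
    """Return a copy with one randomly chosen token duplicated (assumed error type)."""
    if not tokens:
        return []
    i = rng.randrange(len(tokens))
    return tokens[:i + 1] + tokens[i:]


def add_noise(tokens: Tokens, rng: random.Random) -> Tokens:
    """Apply one randomly selected error operation to a grammatical sentence."""
    operation = rng.choice([delete_token, repeat_token])
    return operation(tokens, rng)


def route_parse(
    tokens: Tokens,
    looks_grammatical: Callable[[Tokens], bool],
    grammatical_parser: Callable[[Tokens], str],
    noisy_parser: Callable[[Tokens], str],
) -> str:
    """Send the sentence to the parser that matches the classifier's decision."""
    parser = grammatical_parser if looks_grammatical(tokens) else noisy_parser
    return parser(tokens)
```

Routing on a classifier decision means that grammatical input still reaches the original parser, which is one way to read the abstract's claim about avoiding degradation on grammatical data.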

Jennifer Foster, Joachim Wagner, Josef van Genabith

ACL 2008 | Computational Linguistics | Treebank | Ungrammatical Sentences | Ungrammatical Treebank

Post Info

Added	29 Oct 2010
Updated	29 Oct 2010
Type	Conference
Year	2008
Where	ACL
Authors	Jennifer Foster, Joachim Wagner, Josef van Genabith
