Training Paradigms for Correcting Errors in Grammar and Usage


This paper proposes a novel approach to the problem of training classifiers to detect and correct grammar and usage errors in text by selectively introducing mistakes into the training data. When training a classifier, we would like the distribution of examples seen in training to be as similar as possible to the one seen in testing. In error correction problems, such as correcting mistakes made by second language learners, a system is generally trained on correct data, since annotating data for training is expensive. Error generation methods avoid expensive data annotation and create training data that resemble non-native data with errors. We apply error generation methods and train classifiers for detecting and correcting article errors in essays written by non-native English speakers; we show that training on data that contain errors produces higher accuracy when compared to a system that is trained on clean native data. We propose several training paradigms with error generation a...
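The idea of selectively introducing mistakes into clean training text can be illustrated with a small sketch. This is not the authors' implementation: the function name `inject_article_errors`, the single uniform `error_rate`, and the restriction to replacing or deleting articles that already exist are all assumptions made here for illustration. Inserting articles where none was written would need noun-phrase detection, which this sketch leaves out.

```python
import random

# Surface forms treated as articles; "a" and "an" count as the same class.
ARTICLES = {"a", "an", "the"}


def inject_article_errors(tokens, error_rate, rng):
    """Corrupt articles in a clean token list (hypothetical helper).

    Each article is, with probability `error_rate`, either swapped for the
    other article class or deleted. Returns the noisy tokens and the number
    of articles that were changed.
    """
    noisy, n_changed = [], 0
    for tok in tokens:
        low = tok.lower()
        if low in ARTICLES and rng.random() < error_rate:
            source = "the" if low == "the" else "a"
            # Candidate errors: the other article class, or None for deletion.
            candidates = [c for c in ("a", "the", None) if c != source]
            replacement = rng.choice(candidates)
            if replacement is not None:
                # Simplification: "a" is not adjusted to "an" before vowels.
                noisy.append(replacement)
            n_changed += 1
        else:
            noisy.append(tok)
    return noisy, n_changed


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    clean = "the cat sat on a mat near the door".split()
    noisy, n = inject_article_errors(clean, error_rate=0.3, rng=rng)
    print(" ".join(noisy), "| articles changed:", n)
```

A classifier would then be trained on the noisy tokens with the original article as the label. In a more faithful setup the error rate and the choice of replacement would presumably be estimated from annotated non-native text rather than fixed by hand as they are here.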

Alla Rozovskaya, Dan Roth


Computational Linguistics | Error Generation Methods | NAACL 2010 | Native Data | Training


Post Info

Added	14 Feb 2011
Updated	14 Feb 2011
Type	Conference
Year	2010
Where	NAACL
Authors	Alla Rozovskaya, Dan Roth
