EMNLP
2007

Part-of-Speech Tagging for Middle English through Alignment and Projection of Parallel Diachronic Texts

We demonstrate an approach for inducing a tagger for historical languages based on existing resources for their modern varieties. Tags from Present Day English source text are projected onto Middle English text using alignments over parallel Biblical text. We explore multiple alignment approaches and a bigram tagger to reduce noise in the projected tags. Finally, we train a maximum entropy tagger on the bigram tagger's output for the target Biblical text and test it on tagged Middle English text. This yields tagging accuracy in the low 80s (percent) on Biblical test material and in the 60s on other Middle English material. Our results suggest that these bootstrapping methods have considerable potential and could be used to semi-automate an approach based on incremental manual annotation.
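The projection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `project_tags`, the majority-vote rule for many-to-one alignments, the `UNK` placeholder for unaligned tokens, the Penn Treebank style tag names, and the toy sentence pair are all assumptions made for illustration.

```python
from collections import Counter


def project_tags(source_tags, alignment, target_length, unknown="UNK"):
    """Project POS tags from a tagged source sentence onto a target sentence.

    `alignment` is a list of (source_index, target_index) pairs. A target
    token aligned to several source tokens receives the most frequent tag
    among them; an unaligned target token receives `unknown`.
    (Hypothetical helper, not taken from the paper.)
    """
    votes = [Counter() for _ in range(target_length)]
    for src_i, tgt_i in alignment:
        votes[tgt_i][source_tags[src_i]] += 1
    return [v.most_common(1)[0][0] if v else unknown for v in votes]


# Toy data (illustrative only): a modern sentence with tags, a Middle
# English style counterpart, and a word alignment that misses the last token.
source_tokens = ["In", "the", "beginning", "God", "created"]
source_tags = ["IN", "DT", "NN", "NNP", "VBD"]
target_tokens = ["In", "the", "bigynnyng", "God", "made"]
alignment = [(0, 0), (1, 1), (2, 2), (3, 3)]

projected = project_tags(source_tags, alignment, len(target_tokens))
print(list(zip(target_tokens, projected)))
```

Unaligned or misaligned tokens are the kind of noise that a later smoothing step, such as the bigram tagger mentioned in the abstract, would be expected to reduce.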
Taesun Moon, Jason Baldridge
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2007
Where EMNLP
Authors Taesun Moon, Jason Baldridge