Part-of-Speech Tagging for Middle English through Alignment and Projection of Parallel Diachronic Texts

We demonstrate an approach for inducing a tagger for historical languages based on existing resources for their modern varieties. Tags from Present-Day English source text are projected to Middle English text using alignments on parallel Biblical text. We explore the use of multiple alignment approaches and a bigram tagger to reduce the noise in the projected tags. Finally, we train a maximum entropy tagger on the output of the bigram tagger on the target Biblical text and test it on tagged Middle English text. This leads to tagging accuracy in the low 80s on Biblical test material and in the 60s on other Middle English material. Our results suggest that our bootstrapping methods have considerable potential, and could be used to semi-automate an approach based on incremental manual annotation.
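
The projection step described in the abstract can be illustrated with a minimal sketch. It assumes word alignments are given as (source index, target index) pairs and that tags from several aligners are combined by a simple majority vote. The abstract does not say how the authors combine alignments, so the voting scheme, the function name `project_tags`, and the toy sentence pair are all hypothetical.

```python
from collections import Counter


def project_tags(source_tags, alignments, target_len):
    """Project source-side POS tags onto target token positions.

    source_tags: one tag per source (Present-Day English) token.
    alignments: list of alignments, each an iterable of
        (source_index, target_index) pairs; the format is an assumption.
    target_len: number of target (Middle English) tokens.
    Returns one tag per target token, or None where nothing aligned.
    """
    votes = [Counter() for _ in range(target_len)]
    for alignment in alignments:
        for src_idx, tgt_idx in alignment:
            votes[tgt_idx][source_tags[src_idx]] += 1

    projected = []
    for counter in votes:
        if counter:
            # Majority vote over all aligners (assumed combination rule).
            projected.append(counter.most_common(1)[0][0])
        else:
            # Unaligned target token: left for a later model to fill in.
            projected.append(None)
    return projected


# Toy example; tokens, tags, and alignments are illustrative only.
pde_tags = ["IN", "DT", "NN", "NNP", "VBD"]  # In the beginning God created
me_tokens = ["In", "the", "bigynnyng", "God", "made", "of"]
aligner_a = [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)]
aligner_b = [(0, 0), (1, 1), (2, 2), (3, 3)]

projected = project_tags(pde_tags, [aligner_a, aligner_b], len(me_tokens))
print(list(zip(me_tokens, projected)))
```

Target tokens that receive no tag are exactly the kind of noise that the abstract says the bigram tagger is meant to reduce before the maximum entropy tagger is trained.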

Taesun Moon, Jason Baldridge

Biblical Text | Bigram Tagger | EMNLP 2007 | Middle English Text | Natural Language Processing

Post Info

Added	29 Oct 2010
Updated	29 Oct 2010
Type	Conference
Year	2007
Where	EMNLP
Authors	Taesun Moon, Jason Baldridge
