Measuring historical word sense variation

Download www.perseus.tufts.edu

We describe here a method for automatically identifying word sense variation in a dated collection of historical books in a large digital library. By leveraging a small set of known translation book pairs to induce a bilingual sense inventory and labeled training data for a WSD classifier, we are able to automatically classify the Latin word senses in a 389 million word corpus and track the rise and fall of those senses over a span of two thousand years. We evaluate the performance of seven different classifiers both in a tenfold test on 83,892 words from the aligned parallel corpus and on a smaller, manually annotated sample of 525 words, measuring both the overall accuracy of each system and how well that accuracy correlates (via mean square error) to the observed historical variation.

Categories and Subject Descriptors: H.3.7 [Information Systems: Information Storage and Retrieval]: digital libraries

General Terms: Design, Documentation, Performance

Keywords: Word sense disambiguat...
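The abstract names two evaluation measures, overall accuracy and mean square error against the observed historical variation, without giving their exact formulation. The following is only a minimal sketch of how such measures could be computed, assuming the "observed variation" is the per-period proportion of a sense in hand-labeled data and the estimate is the same proportion in classifier output. The period names, sense labels, and function names are hypothetical and are not taken from the paper.

```python
from collections import Counter


def accuracy(gold, predicted):
    """Fraction of tokens whose predicted sense matches the gold sense."""
    if len(gold) != len(predicted) or not gold:
        raise ValueError("gold and predicted must be non-empty and equal in length")
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def sense_proportions(labels_by_period, sense):
    """Share of tokens tagged with `sense` in each period (empty periods skipped)."""
    return {
        period: Counter(labels)[sense] / len(labels)
        for period, labels in labels_by_period.items()
        if labels
    }


def mean_square_error(observed, estimated):
    """Mean squared difference between two per-period proportion series."""
    periods = observed.keys() & estimated.keys()
    if not periods:
        raise ValueError("no periods in common")
    return sum((observed[p] - estimated[p]) ** 2 for p in periods) / len(periods)


# Hypothetical toy data: two periods, two senses "A" and "B" of one word.
gold_by_period = {
    "period_1": ["A", "A", "B", "B"],
    "period_2": ["A", "B", "B", "B"],
}
pred_by_period = {
    "period_1": ["A", "B", "B", "B"],
    "period_2": ["A", "B", "B", "B"],
}

# Flatten both label sets in the same period order for token-level accuracy.
gold_flat = [s for p in gold_by_period for s in gold_by_period[p]]
pred_flat = [s for p in gold_by_period for s in pred_by_period[p]]

acc = accuracy(gold_flat, pred_flat)
mse = mean_square_error(
    sense_proportions(gold_by_period, "A"),
    sense_proportions(pred_by_period, "A"),
)
```

Keeping the two measures separate reflects the distinction the abstract draws: a classifier can score well token by token while still distorting the per-period trend, or the reverse.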

David Bamman, Gregory Crane

Education | JCDL 2011 | Mean Square Error | Sense Variation | Word Sense Disambiguation |

Post Info

Added	15 Sep 2011
Updated	15 Sep 2011
Type	Journal
Year	2011
Where	JCDL
Authors	David Bamman, Gregory Crane
