Sciweavers

COLING
2008

Measuring Topic Homogeneity and its Application to Dictionary-Based Word Sense Disambiguation

14 years 1 months ago
Measuring Topic Homogeneity and its Application to Dictionary-Based Word Sense Disambiguation
The use of topical features is abundant in Natural Language Processing (NLP), a major example being in dictionary-based Word Sense Disambiguation (WSD). Yet previous research does not attempt to measure the level of topic cohesion in documents, despite assertions of its effects. This paper introduces a quantitative measure of Topic Homogeneity using a range of NLP resources and not requiring prior knowledge of correct senses. Evaluation is performed firstly by using the WordNet::Domains package to create word-sets with varying levels of homogeneity and comparing our results with those expected. Additionally, to evaluate each measure's potential value, the homogeneity results are correlated against those of 3 co-occurrence/dictionarybased WSD techniques, tested on 1040 Semcor and SENSEVAL sub-documents. Many low-moderate correlations are found to exist with several in the moderate range (above .40). These correlations surpass polysemy and senseentropy, the 2 most cited factors aff...
Ann Gledson, John Keane
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2008
Where COLING
Authors Ann Gledson, John Keane
Comments (0)