Measuring Topic Homogeneity and its Application to Dictionary-Based Word Sense Disambiguation


The use of topical features is abundant in Natural Language Processing (NLP), a major example being dictionary-based Word Sense Disambiguation (WSD). Yet previous research has not attempted to measure the level of topic cohesion in documents, despite assertions of its effects. This paper introduces a quantitative measure of Topic Homogeneity that uses a range of NLP resources and does not require prior knowledge of correct senses. Evaluation is performed first by using the WordNet::Domains package to create word-sets with varying levels of homogeneity and comparing our results with those expected. Additionally, to evaluate each measure's potential value, the homogeneity results are correlated against those of 3 co-occurrence/dictionary-based WSD techniques, tested on 1040 Semcor and SENSEVAL sub-documents. Many low-to-moderate correlations are found, with several in the moderate range (above .40). These correlations surpass polysemy and sense-entropy, the 2 most cited factors aff...

Ann Gledson, John Keane


COLING 2008 | Computational Linguistics | Correlations Surpass Polysemy | Measure Achieves Correlations | Topic Homogeneity


Added	29 Oct 2010
Updated	29 Oct 2010
Type	Conference
Year	2008
Where	COLING
Authors	Ann Gledson, John Keane
