Sciweavers

EMNLP
2004

Unsupervised Domain Relevance Estimation for Word Sense Disambiguation

This paper presents Domain Relevance Estimation (DRE), a fully unsupervised text categorization technique based on statistically estimating the relevance of a text with respect to a given category. We use a pre-defined set of categories (which we call domains) that have previously been associated with WORDNET word senses. Given a domain, DRE distinguishes between relevant and non-relevant texts by means of a Gaussian mixture model that describes the frequency distribution of domain words in a large-scale corpus. An Expectation-Maximization algorithm then computes the parameters that maximize the likelihood of the model on the empirical data. Correctly identifying the domain of a text is crucial for Domain Driven Disambiguation, an unsupervised Word Sense Disambiguation (WSD) methodology that uses only domain information. DRE has therefore been exploited and evaluated in the context of a WSD task. Results are comparable to those of state-oft...
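The mixture-model step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the exact model, so the code below assumes a one-dimensional mixture of two Gaussians (one for non-relevant texts, one for relevant texts) fitted with Expectation-Maximization to per-text domain-word frequencies. The function names (`fit_two_gaussians`, `relevance`), the initialization, and the synthetic frequencies are all hypothetical and chosen only for illustration.

```python
import math
import random
import statistics


def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def fit_two_gaussians(values, iterations=100, min_std=1e-3):
    """Fit a two-component 1-D Gaussian mixture with EM.

    Returns (weights, means, stds), each a list of length 2.
    Initialization (quartile means, shared std) is an assumption of this sketch.
    """
    lo, hi = min(values), max(values)
    means = [lo + 0.25 * (hi - lo), lo + 0.75 * (hi - lo)]
    spread = max(statistics.pstdev(values), min_std)
    stds = [spread, spread]
    weights = [0.5, 0.5]

    for _ in range(iterations):
        # E-step: posterior probability of each component for each value.
        resp = []
        for x in values:
            p = [weights[k] * gaussian_pdf(x, means[k], stds[k]) for k in range(2)]
            total = sum(p) or 1e-300  # guard against numerical underflow
            resp.append([pk / total for pk in p])

        # M-step: re-estimate parameters to increase the data likelihood.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # component received no mass; leave it unchanged
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / nk
            stds[k] = max(math.sqrt(var), min_std)

    return weights, means, stds


def relevance(x, weights, means, stds):
    """Posterior probability that x belongs to the higher-mean component,
    which this sketch treats as the 'relevant' one."""
    k_rel = 0 if means[0] > means[1] else 1
    p = [weights[k] * gaussian_pdf(x, means[k], stds[k]) for k in range(2)]
    total = sum(p)
    return p[k_rel] / total if total > 0 else 0.0


# Synthetic domain-word frequencies (fraction of words in a text that belong
# to the domain). These numbers are made up for the demonstration.
rng = random.Random(0)
samples = [rng.gauss(0.02, 0.005) for _ in range(300)]   # non-relevant texts
samples += [rng.gauss(0.08, 0.02) for _ in range(100)]   # relevant texts

weights, means, stds = fit_two_gaussians(samples)
print("estimated means:", sorted(means))
print("relevance of a text with frequency 0.10:", relevance(0.10, weights, means, stds))
print("relevance of a text with frequency 0.01:", relevance(0.01, weights, means, stds))
```

Under these assumptions, a text whose domain-word frequency lies near the higher-mean component gets a relevance close to 1, and one near the lower-mean component gets a relevance close to 0. A threshold on this posterior is one plausible way to separate relevant from non-relevant texts.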
Added 30 Oct 2010
Updated 30 Oct 2010
Type Conference
Year 2004
Where EMNLP
Authors Alfio Massimiliano Gliozzo, Bernardo Magnini, Carlo Strapparava