
CICLING
2009
Springer

Semi-supervised Word Sense Disambiguation Using the Web as Corpus

Abstract. Like any other classification task, Word Sense Disambiguation requires a large number of training examples. Such examples, which are easily obtained for most tasks, are particularly difficult to obtain in this case. Given this difficulty, in this paper we investigate the possibility of using a Web-based approach for determining the correct sense of an ambiguous word based only on its surrounding context. In particular, we propose a semi-supervised method that is especially suited to working with just a few training examples. The method considers the automatic extraction of unlabeled examples from the Web and their iterative integration into the training data set. The experimental results, obtained over a subset of ten nouns from the SemEval lexical sample task, are encouraging. They show that it is possible to improve the baseline accuracy of classifiers such as Naïve Bayes and SVM using some unlabeled examples extracted from the Web.
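The iterative integration of unlabeled examples described in the abstract resembles a self-training loop. The following is only a minimal sketch of that general idea, not the authors' implementation: the abstract does not say how unlabeled examples are selected, so the confidence threshold, the function names (train_nb, predict_nb, self_train), and the toy contexts for the word "bank" are all assumptions made for illustration. Extraction of examples from the Web is not shown; the unlabeled list stands in for it.

```python
import math
from collections import Counter, defaultdict

def train_nb(examples):
    """Fit a multinomial Naive Bayes model from (tokens, sense) pairs."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, sense in examples:
        sense_counts[sense] += 1
        word_counts[sense].update(tokens)
        vocab.update(tokens)
    return sense_counts, word_counts, vocab

def predict_nb(model, tokens):
    """Return (most likely sense, its posterior) using add-one smoothing."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    log_scores = {}
    for sense, count in sense_counts.items():
        denom = sum(word_counts[sense].values()) + len(vocab)
        score = math.log(count / total)
        for tok in tokens:
            if tok in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[sense][tok] + 1) / denom)
        log_scores[sense] = score
    # Normalize the log scores into posterior probabilities.
    top = max(log_scores.values())
    exp_scores = {s: math.exp(v - top) for s, v in log_scores.items()}
    z = sum(exp_scores.values())
    best = max(exp_scores, key=exp_scores.get)
    return best, exp_scores[best] / z

def self_train(labeled, unlabeled, threshold=0.75, max_rounds=5):
    """Iteratively move confidently classified examples into the training set.

    The threshold-based selection is an assumption of this sketch.
    """
    training = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        model = train_nb(training)
        confident, remaining = [], []
        for tokens in pool:
            sense, prob = predict_nb(model, tokens)
            if prob >= threshold:
                confident.append((tokens, sense))
            else:
                remaining.append(tokens)
        if not confident:
            break  # nothing new to add, so stop early
        training.extend(confident)
        pool = remaining
    return train_nb(training), training

# Toy data (invented for illustration): two labeled contexts of "bank"
# and three unlabeled contexts standing in for snippets from the Web.
labeled = [
    ("deposit money account".split(), "finance"),
    ("river water shore".split(), "river"),
]
unlabeled = [
    "money loan account".split(),
    "water fishing shore".split(),
    "loan interest".split(),
]
model, training = self_train(labeled, unlabeled)
sense, confidence = predict_nb(model, "money loan".split())
```

In this sketch an example that never reaches the threshold simply stays in the unlabeled pool, which keeps uncertain predictions from contaminating the training data at the cost of using fewer examples.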
Added 19 May 2010
Updated 19 May 2010
Type Conference
Year 2009
Where CICLING
Authors Rafael Guzmán-Cabrera, Paolo Rosso, Manuel Montes-y-Gómez, Luis Villaseñor Pineda, David Pinto Avendaño