Semi-supervised Word Sense Disambiguation Using the Web as Corpus

Abstract. Like any other classification task, Word Sense Disambiguation requires a large number of training examples. Such examples, which are easily obtained for most tasks, are particularly difficult to obtain in this case. For this reason, in this paper we investigate the possibility of using a Web-based approach for determining the correct sense of an ambiguous word based only on its surrounding context. In particular, we propose a semi-supervised method that is especially suited to working with just a few training examples. The method automatically extracts unlabeled examples from the Web and iteratively integrates them into the training data set. The experimental results, obtained over a subset of ten nouns from the SemEval lexical sample task, are encouraging: they show that it is possible to improve the baseline accuracy of classifiers such as Naïve Bayes and SVM using some unlabeled examples extracted from the Web.
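
The iterative integration of unlabeled examples described in the abstract can be illustrated with a generic self-training loop. The sketch below is not the authors' implementation. It assumes a small multinomial Naive Bayes classifier with add-one smoothing, a posterior-confidence threshold as the acceptance criterion, and an in-memory list (`web_pool`) standing in for contexts retrieved from the Web. All function names, the threshold value, and the toy data for the ambiguous word "bank" are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_nb(examples):
    """Fit a multinomial Naive Bayes model from (tokens, sense) pairs."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, sense in examples:
        sense_counts[sense] += 1
        word_counts[sense].update(tokens)
        vocab.update(tokens)
    return sense_counts, word_counts, vocab


def predict_nb(model, tokens):
    """Return (best_sense, posterior probability) for one context."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    log_scores = {}
    for sense, count in sense_counts.items():
        # Add-one smoothing over the training vocabulary.
        denom = sum(word_counts[sense].values()) + len(vocab)
        score = math.log(count / total)
        for tok in tokens:
            if tok in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[sense][tok] + 1) / denom)
        log_scores[sense] = score
    # Convert log scores to a posterior for the best sense (log-sum-exp).
    top = max(log_scores.values())
    norm = sum(math.exp(s - top) for s in log_scores.values())
    best = max(log_scores, key=log_scores.get)
    return best, 1.0 / norm


def self_train(labeled, unlabeled, threshold=0.75, max_rounds=5):
    """Move confidently classified contexts into the training set, round by round.

    The confidence threshold is an assumed selection rule; the abstract does
    not say how unlabeled examples are chosen.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        model = train_nb(labeled)
        accepted, remaining = [], []
        for tokens in pool:
            sense, confidence = predict_nb(model, tokens)
            if confidence >= threshold:
                accepted.append((tokens, sense))
            else:
                remaining.append(tokens)
        if not accepted:  # nothing new was learned, so stop early
            break
        labeled.extend(accepted)
        pool = remaining
    return train_nb(labeled), labeled


# Toy seed set: two hand-labeled contexts of the ambiguous word "bank".
seed = [
    ("deposit money account".split(), "finance"),
    ("river water shore".split(), "river"),
]
# Hypothetical stand-in for unlabeled contexts retrieved from the Web.
web_pool = [
    "money loan account".split(),
    "water shore fishing".split(),
    "loan interest account".split(),
]

model, final_training_set = self_train(seed, web_pool)
```

In this toy run the third context is too uncertain to be accepted in the first round and only joins the training set after "loan" has been learned from the first one, which is the effect that iterating is meant to produce.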

Rafael Guzmán-Cabrera, Paolo Rosso, Manuel

CICLING 2009 | Natural Language Processing | Training Examples | Unlabeled Examples | Word Sense Disambiguation

» Word Sense Disambiguation by Web Mining for Word Co-occurrence Probabilities

» Word Sense Disambiguation Using Heterogeneous Language Resources

» Statistical Corpus-Based Word Sense Disambiguation: Pseudowords vs. Real Ambiguous Words

» Graph-based Word Clustering using a Web Search Engine

» Unsupervised WSD based on Automatically Retrieved Examples: The Importance of Bias

» Using WordNet to Disambiguate Word Senses for Text Classification

» Study of Word Sense Disambiguation System that uses Contextual Features Approach of Combi...

» Word Sense Disambiguation using Optimised Combinations of Knowledge Sources

Post Info

Added	19 May 2010
Updated	19 May 2010
Type	Conference
Year	2009
Where	CICLING
Authors	Rafael Guzmán-Cabrera, Paolo Rosso, Manuel Montes-y-Gómez, Luis Villaseñor Pineda, David Pinto Avendaño
