Unsupervised Word Sense Disambiguation Using Bilingual Comparable Corpora



An unsupervised method for word sense disambiguation using a bilingual comparable corpus was developed. First, it extracts statistically significant pairs of related words from the corpus of each language. Then, aligning pairs of related words translingually, it calculates the correlation between the senses of a first-language polysemous word and the words related to the polysemous word, which can be regarded as clues for determining the most suitable sense. Finally, for each instance of the polysemous word, it selects the sense that maximizes the score, i.e., the sum of the correlations between each sense and the clues appearing in the context of the instance. To overcome both the problem of ambiguity in the translingual alignment of pairs of related words and that of disparity of topical coverage between corpora of different languages, an algorithm for calculating the correlation between senses and clues iteratively was devised. An experiment using Wall Street Journal and Nihon Keiz...
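The final selection step described above (choose the sense whose summed correlation with the clues in the context is highest) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name `select_sense`, the nested-dictionary layout of the correlation table, the choice to count each clue once per context, and the toy numbers for the word "bank" are all hypothetical. The sketch also takes the sense-clue correlations as given and does not attempt the iterative correlation calculation.

```python
def select_sense(context_words, correlation):
    """Pick the sense with the highest summed correlation to the context clues.

    correlation: dict mapping sense -> dict mapping clue word -> correlation.
    (Hypothetical layout; the abstract does not specify a data structure.)
    """
    # Assumption: each distinct clue contributes once per context.
    clues_in_context = set(context_words)
    scores = {}
    for sense, clue_weights in correlation.items():
        # Words that are not clues for this sense contribute nothing.
        scores[sense] = sum(clue_weights.get(w, 0.0) for w in clues_in_context)
    best_sense = max(scores, key=scores.get)
    return best_sense, scores


# Toy, made-up correlations for two senses of "bank" (not from the paper).
toy_correlation = {
    "finance": {"loan": 0.5, "interest": 0.25},
    "river": {"water": 0.5, "interest": 0.125},
}
toy_context = ["the", "loan", "interest", "rose"]

best, scores = select_sense(toy_context, toy_correlation)
print(best, scores)
```

With these toy values the "finance" sense wins because two of its clues appear in the context, while the "river" sense is supported only weakly by "interest".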

Hiroyuki Kaji, Yasutsugu Morimoto


COLING 2002 | COLING 2008 | Polysemous Words | Related Words | Word Sense Disambiguation


Post Info

Added	17 Dec 2010
Updated	17 Dec 2010
Type	Journal
Year	2002
Where	COLING
Authors	Hiroyuki Kaji, Yasutsugu Morimoto
