COLING
2010

Towards an optimal weighting of context words based on distance

Word Sense Disambiguation (WSD) often relies on a context model or vector constructed from the words that co-occur with the target word within the same text window. In most cases, a fixed-size window is used, whose size is determined by trial and error. In addition, words within the same window are weighted uniformly regardless of their distance to the target word. Intuitively, it seems more reasonable to assign a stronger weight to context words closer to the target word. However, it is difficult to manually define the optimal weighting function based on distance. In this paper, we propose an unsupervised method for determining the optimal weights for context words according to their distance. The general idea is that the optimal weights should maximize the similarity of two context models of the target word generated from two random samples. This principle is applied to both English and Japanese. The context models using the resulting weights are used in WSD tasks on Semeval data. Our e...
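
The general idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which similarity measure or optimization procedure is used, so cosine similarity and a plain search over a few candidate weight lists are assumptions made here, and all function names (context_vector, cosine, split_similarity, best_weights) are hypothetical.

```python
import math
import random
from collections import Counter


def context_vector(occurrences, weights):
    """Build a weighted bag of context words.

    occurrences: list of (tokens, target_index) pairs.
    weights: weights[d - 1] is the weight of a word at distance d from the
             target; words farther than len(weights) are ignored.
    """
    vec = Counter()
    for tokens, target_idx in occurrences:
        for i, word in enumerate(tokens):
            d = abs(i - target_idx)
            if 1 <= d <= len(weights):
                vec[word] += weights[d - 1]
    return vec


def cosine(u, v):
    """Cosine similarity of two sparse vectors (assumed measure, see lead-in)."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def split_similarity(occurrences, weights, seed=0):
    """Similarity of the context models built from two random halves."""
    rng = random.Random(seed)
    shuffled = list(occurrences)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    model_a = context_vector(shuffled[:half], weights)
    model_b = context_vector(shuffled[half:], weights)
    return cosine(model_a, model_b)


def best_weights(occurrences, candidates, seed=0):
    """Pick the candidate weight list with the highest split similarity."""
    return max(candidates, key=lambda w: split_similarity(occurrences, w, seed))
```

Calling best_weights(occurrences, [[1.0, 1.0, 1.0], [1.0, 0.5, 0.25]]) would compare a uniform window of size 3 against a decaying one on the same random split. A single split is noisy, so in practice one would probably average split_similarity over several seeds before comparing candidates.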
Added 13 May 2011
Updated 13 May 2011
Type Journal
Year 2010
Where COLING
Authors Bernard Brosseau-Villeneuve, Jian-Yun Nie, Noriko Kando