Contextual Word Similarity and Estimation from Sparse Data

In recent years there has been much interest in word cooccurrence relations, such as n-grams, verb-object combinations, or cooccurrence within a limited context. This paper discusses how to estimate the likelihood of cooccurrences that do not occur in the training data. We present a method that makes local analogies between each specific unobserved cooccurrence and other cooccurrences that contain similar words. These analogies are based on the assumption that similar word cooccurrences have similar values of mutual information. Accordingly, the word similarity metric captures similarities between vectors of mutual information values. Our evaluation suggests that this method performs better than existing frequency-based smoothing methods, and may provide an alternative to class-based models. A background survey is included, covering issues of lexical cooccurrence, data sparseness and smoothing, word similarity and clustering, and mutual information.
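The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the min/max overlap used as the similarity score, the restriction to positive mutual information values, the similarity-weighted average, the neighbour count k, and the toy counts are all assumptions chosen for illustration. The sketch computes pointwise mutual information for observed pairs, compares words by their vectors of mutual information values, and estimates the mutual information of an unobserved pair from the most similar words that were observed with the same partner.

```python
import math
from collections import Counter


def pmi_table(pair_counts):
    """Pointwise mutual information (base 2) for every observed pair."""
    total = sum(pair_counts.values())
    left, right = Counter(), Counter()
    for (w1, w2), c in pair_counts.items():
        left[w1] += c
        right[w2] += c
    return {
        (w1, w2): math.log2(c * total / (left[w1] * right[w2]))
        for (w1, w2), c in pair_counts.items()
    }


def mi_vector(word, pmi):
    """Positive MI values of `word` (as left element), keyed by partner word."""
    return {w2: v for (w1, w2), v in pmi.items() if w1 == word and v > 0}


def similarity(a, b, pmi):
    """Assumed similarity: min/max overlap of two MI vectors, in [0, 1]."""
    va, vb = mi_vector(a, pmi), mi_vector(b, pmi)
    keys = set(va) | set(vb)
    num = sum(min(va.get(k, 0.0), vb.get(k, 0.0)) for k in keys)
    den = sum(max(va.get(k, 0.0), vb.get(k, 0.0)) for k in keys)
    return num / den if den else 0.0


def estimate_mi(w1, w2, pmi, k=2):
    """Estimate MI of an unobserved (w1, w2) by a similarity-weighted
    average over the k words most similar to w1 that occur with w2."""
    candidates = [
        (similarity(w1, x, pmi), pmi[(x, y)])
        for (x, y) in pmi
        if y == w2 and x != w1
    ]
    top = sorted(candidates, reverse=True)[:k]
    weight = sum(s for s, _ in top)
    if weight == 0:
        return None  # no similar word was observed with w2
    return sum(s * v for s, v in top) / weight


# Hypothetical verb-object counts; ("devour", "bread") is unobserved.
counts = {
    ("read", "book"): 6,
    ("eat", "apple"): 4,
    ("eat", "bread"): 4,
    ("devour", "apple"): 2,
}
pmi = pmi_table(counts)
estimate = estimate_mi("devour", "bread", pmi)
```

In this toy data "devour" shares the object "apple" with "eat", so the estimate for the unseen pair ("devour", "bread") is borrowed from ("eat", "bread"), while "read" contributes nothing because its mutual information vector does not overlap with that of "devour".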

Ido Dagan, Shaul Marcus, Shaul Markovitch

ACL 1993 | ACL 2007 | Mutual Information | Similar Word | Word Cooccurrences |

Post Info

Added	02 Nov 2010
Updated	02 Nov 2010
Type	Conference
Year	1993
Where	ACL
Authors	Ido Dagan, Shaul Marcus, Shaul Markovitch
