
LREC
2008

Using a Probabilistic Model of Context to Detect Word Obfuscation

This paper proposes a distributional model of word use and word meaning that is derived purely from a body of text, and then applies this model to determine whether certain words are used in or out of context. We suggest that the contexts of words can be viewed as multinomially distributed random variables. We illustrate how, using this basic idea, the problem of detecting whether or not a word is used in context can be formulated as a likelihood ratio test. We also define a measure of semantic relatedness between a word and its context using the same model. We assume that words that typically appear together are related and thus have similar probability distributions, and that words used in an unusual way will have probability distributions that are dissimilar from those of their surrounding context. The relatedness of a word to its context is based on the Kullback-Leibler divergence between the probability distributions assigned to the constituent words in the given sentence. We employed our ...
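The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy corpus, the add-one smoothing, the sentence-level co-occurrence window, the averaging over context words, and the function names (`context_distributions`, `kl_divergence`, `mean_context_divergence`) are all assumptions made for illustration. Each word gets a smoothed multinomial distribution over the words it co-occurs with, and a word is scored against its context by the Kullback-Leibler divergence between those distributions. A lower score suggests the word fits its context.

```python
import math
from collections import Counter, defaultdict


def context_distributions(sentences, alpha=1.0):
    """Estimate, for every word, a smoothed multinomial over co-occurring words.

    Assumption: the context of a word is every other token in the same sentence,
    and additive (alpha) smoothing keeps all probabilities non-zero.
    """
    vocab = sorted({w for s in sentences for w in s})
    counts = defaultdict(Counter)
    for s in sentences:
        for i, w in enumerate(s):
            for j, c in enumerate(s):
                if i != j:
                    counts[w][c] += 1
    dists = {}
    for w in vocab:
        total = sum(counts[w].values()) + alpha * len(vocab)
        dists[w] = {c: (counts[w][c] + alpha) / total for c in vocab}
    return dists


def kl_divergence(p, q):
    """KL(p || q) for two distributions over the same vocabulary."""
    return sum(p[c] * math.log(p[c] / q[c]) for c in p)


def mean_context_divergence(word, context, dists):
    """Average KL divergence from the word's distribution to each context word's."""
    return sum(kl_divergence(dists[word], dists[c]) for c in context) / len(context)


# Hypothetical toy corpus: one topic about attacks, one about gardening.
corpus = [
    ["bomb", "attack", "target"],
    ["bomb", "explode", "target"],
    ["attack", "explode", "bomb"],
    ["flower", "garden", "grow"],
    ["flower", "water", "garden"],
    ["grow", "water", "flower"],
]
dists = context_distributions(corpus)

# Score an in-context word and a possible substitute against the same context.
context = ["attack", "target"]
for candidate in ("bomb", "flower"):
    print(candidate, mean_context_divergence(candidate, context, dists))
```

In this toy setting, "flower" diverges more from the context "attack", "target" than "bomb" does, which is the kind of signal that would flag it as a possibly obfuscated word. A likelihood ratio test, as the abstract describes, would need additional modelling choices that the truncated abstract does not specify.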
Sanaz Jabbari, Ben Allison, Louise Guthrie
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference