We propose an unsupervised method for detecting spam documents from Web page data, based on equivalence relations on strings. We propose 3 measures for quantifying the alienness (...
The problem of content-based image and video retrieval with textual queries is often posed as that of visual concept classification, where classifiers for a set of predetermined...
The vast majority of the features used in today’s commercially deployed image search systems employ techniques that are largely indistinguishable from text-document search – t...
We present a Bayesian framework for content-based image retrieval which models the distribution of color and texture features within sets of related images. Given a userspecified ...
Background: Text mining has become a useful tool for biologists trying to understand the genetics of diseases. In particular, it can help identify the most interesting candidate g...