We describe a compression model for semistructured documents, called the Structural Contexts Model (SCM), which takes advantage of the context information usually implicit in the stru...
The classical (ad hoc) document retrieval problem has traditionally been approached through ranking according to heuristically developed functions (such as tf-idf or BM25) or gene...
We investigate the task of finding links from Wikipedia pages to external web pages. Such external links significantly extend the information in Wikipedia with information from ...
Sentiment classification is the task of labeling a review document according to the polarity of its prevailing opinion (favorable or unfavorable). In approaching this problem, a m...
Philip Beineke, Trevor Hastie, Shivakumar Vaithyan...
As a principled approach to capturing semantic relations of words in information retrieval, statistical translation models have been shown to outperform simple document language m...