High findability of documents within a certain cut-off rank is considered an important factor in recall-oriented application domains such as patent or legal document retrieval. ...
One of the most important steps in web crawling is determining the starting points, or seed selection. This paper identifies and explores the problem of seed selection in webscal...
Relevance Feedback has proven very effective for improving retrieval accuracy. A difficult yet important problem in all relevance feedback methods is how to optimally balance the...
In this paper, we present a semi-supervised learning method for web page classification, leveraging click logs to augment training data by propagating class labels to unlabeled si...
Soo-Min Kim, Patrick Pantel, Lei Duan, Scott Gaffn...
We have studied the problem of linking event information across different languages without the use of translation systems or dictionaries. The linking is based on interlingua in...