Previous anti-spamming algorithms based on link structure suffer from either the weakness of the page value metric or the vagueness of the seed selection. In this paper, we propos...
Despite the widespread use of BM25, there have been few studies examining its effectiveness on a document description over single and multiple field combinations. We determine t...
High findability of documents within a certain cut-off rank is considered an important factor in recall-oriented application domains such as patent or legal document retrieval. ...
One of the most important steps in web crawling is determining the starting points, or seed selection. This paper identifies and explores the problem of seed selection in webscal...
Given an author-conference graph, how do we answer proximity queries (e.g., what are the most related conferences for John Smith?); how can we tailor the search result if the user...