Near-duplicate detection is not only an important pre- and post-processing task in Information Retrieval but also an effective spam-detection technique. Among different approache...
In this paper, we present a novel near-duplicate document detection method that can easily be tuned for a particular domain. Our method represents each document as a real-valued s...
Hannaneh Hajishirzi, Wen-tau Yih, Aleksander Kolcz
Queries submitted to a retrieval system are often ambiguous. In such a situation, a sensible strategy is to diversify the ranking of results to be retrieved, in the hope that users...
Rodrygo L. T. Santos, Jie Peng, Craig Macdonald, I...
An essential part of an expert-finding task, such as matching reviewers to submitted papers, is the ability to model the expertise of a person based on documents. We evaluate seve...
Web spamming techniques aim to achieve undeserved rankings in search results. Research has been widely conducted on identifying such spam and neutralizing its influence. However,...