Topical noise in blogs arises when bloggers digress from the central topical thrust of their blogs. We introduce a method to explicitly incorporate a model of topical noise into a...
Documents in many corpora, such as digital libraries and webpages, contain both content and link information. To explicitly consider the document relations represented by links, i...
The aim of query-based sampling is to obtain a sufficient, representative sample of an underlying (text) collection. Current measures for assessing sample quality are too coarse gr...
Topic distillation is one of the main information needs when users search the Web. In previous approaches to topic distillation, the single page was treated as the basic searching ...
Tao Qin, Tie-Yan Liu, Xu-Dong Zhang, Guang Feng, W...
In the area of image retrieval, post-retrieval processing is often used to refine the retrieval results to better satisfy users’ requirements. Previous methods mainly focus on p...