Sciweavers

8301 search results - page 1505 / 1661
» Risk-Aware Information Retrieval
WWW
2007
ACM
16 years 6 months ago
Efficient search engine measurements
We address the problem of measuring global quality metrics of search engines, like corpus size, index freshness, and density of duplicates in the corpus. The recently proposed est...
Ziv Bar-Yossef, Maxim Gurevich
WWW
2007
ACM
16 years 6 months ago
Brand awareness and the evaluation of search results
We investigate the effect of search engine brand (i.e., the identifying name or logo that distinguishes a product from its competitors) on evaluation of system performance. This r...
Bernard J. Jansen, Mimi Zhang, Ying Zhang
WWW
2007
ACM
16 years 6 months ago
On anonymizing query logs via token-based hashing
In this paper we study the privacy preservation properties of a specific technique for query log anonymization: token-based hashing. In this approach, each query is tokenized, and ...
Ravi Kumar, Jasmine Novak, Bo Pang, Andrew Tomkins
WWW
2007
ACM
16 years 6 months ago
A new suffix tree similarity measure for document clustering
In this paper, we propose a new similarity measure to compute the pairwise similarity of text-based documents based on the suffix tree document model. By applying the new suffix tree ...
Hung Chim, Xiaotie Deng
WWW
2007
ACM
16 years 6 months ago
Detecting near-duplicates for web crawling
Near-duplicate web documents are abundant. Two such documents differ from each other in a very small portion that displays advertisements, for example. Such differences are irrele...
Gurmeet Singh Manku, Arvind Jain, Anish Das Sarma