Sciweavers

5284 search results - page 145 / 1057
» Sampling search-engine results
WWW
2006
ACM
14 years 4 months ago
Do not crawl in the DUST: different URLs with similar text
We consider the problem of DUST: Different URLs with Similar Text. Such duplicate URLs are prevalent in web sites, as web server software often uses aliases and redirections, and...
Uri Schonfeld, Ziv Bar-Yossef, Idit Keidar
CIKM
2003
Springer
14 years 3 months ago
Using titles and category names from editor-driven taxonomies for automatic evaluation
Evaluation of IR systems has always been difficult because of the need for manually assessed relevance judgments. The advent of large editor-driven taxonomies on the web opens the...
Steven M. Beitzel, Eric C. Jensen, Abdur Chowdhury...
CORR
2002
Springer
13 years 10 months ago
Answering Subcognitive Turing Test Questions: A Reply to French
Robert French has argued that a disembodied computer is incapable of passing a Turing Test that includes subcognitive questions. Subcognitive questions are designed to probe the n...
Peter D. Turney
SOCIALCOM
2010
13 years 8 months ago
Using Text Analysis to Understand the Structure and Dynamics of the World Wide Web as a Multi-Relational Graph
A representation of the World Wide Web as a directed graph, with vertices representing web pages and edges representing hypertext links, underpins the algorithms used by web search...
Harish Sethu, Alexander Yates
ICDE
2009
IEEE
14 years 5 months ago
Sketching Sampled Data Streams
Sampling is used as a universal method to reduce the running time of computations – the computation is performed on a much smaller sample and then the result is scaled to comp...
Florin Rusu, Alin Dobra