Sciweavers

» Structured Web Pages Management for Efficient Data Retrieval
WWW
2004
ACM
14 years 8 months ago
Ranking the web frontier
The celebrated PageRank algorithm has proved to be a very effective paradigm for ranking results of web search algorithms. In this paper we refine this basic paradigm to take into...
Nadav Eiron, Kevin S. McCurley, John A. Tomlin
WWW
2007
ACM
14 years 8 months ago
Detecting near-duplicates for web crawling
Near-duplicate web documents are abundant. Two such documents differ from each other in a very small portion that displays advertisements, for example. Such differences are irrele...
Gurmeet Singh Manku, Arvind Jain, Anish Das Sarma
WWW
2008
ACM
14 years 8 months ago
Sailer: an effective search engine for unified retrieval of heterogeneous XML and web documents
This paper studies the problem of unified ranked retrieval of heterogeneous XML documents and Web data. We propose an effective search engine called Sailer to adaptively and versa...
Guoliang Li, Jianhua Feng, Jianyong Wang, Xiaoming...
EMNLP
2010
13 years 5 months ago
Storing the Web in Memory: Space Efficient Language Models with Constant Time Retrieval
We present three novel methods of compactly storing very large n-gram language models. These methods use substantially less space than all known approaches and allow n-gram probab...
David Guthrie, Mark Hepple
ECIR
2008
Springer
13 years 9 months ago
The Importance of Link Evidence in Wikipedia
Wikipedia is one of the most popular information sources on the Web. The free encyclopedia is densely linked. The link structure in Wikipedia differs from the Web at large: interna...
Jaap Kamps, Marijn Koolen