Sciweavers

» Hypertext Information Retrieval for the Web
WWW
2004
ACM
14 years 10 months ago
Design of a crawler with bounded bandwidth
This paper presents an algorithm to bound the bandwidth of a Web crawler. The crawler collects statistics on the transfer rate of each server to predict the expected bandwidth use...
Michelangelo Diligenti, Marco Maggini, Filippo Mar...
CIKM
2005
ACM
14 years 2 months ago
Implicit user modeling for personalized search
Information retrieval systems (e.g., web search engines) are critical for overcoming information overload. A major deficiency of existing retrieval systems is that they generally...
Xuehua Shen, Bin Tan, ChengXiang Zhai
ECIR
2007
Springer
13 years 10 months ago
Multinomial Randomness Models for Retrieval with Document Fields
Document fields, such as the title or the headings of a document, offer a way to consider the structure of documents for retrieval. Most of the proposed approaches in the literatu...
Vassilis Plachouras, Iadh Ounis
WWW
2005
ACM
14 years 10 months ago
Thresher: automating the unwrapping of semantic content from the World Wide Web
We describe Thresher, a system that lets non-technical users teach their browsers how to extract semantic web content from HTML documents on the World Wide Web. Users specify exam...
Andrew Hogue, David R. Karger
SIGIR
2009
ACM
14 years 3 months ago
A comparison of retrieval-based hierarchical clustering approaches to person name disambiguation
This paper describes a simple clustering approach to person name disambiguation of retrieved documents. The methods are based on standard IR concepts and do not require any task-s...
Christof Monz, Wouter Weerkamp