Query-independent features (also called document priors), such as the number of incoming links to a document, its PageRank, or the type of its associated URL, have been successfu...
Web search engines consistently collect information about users' interactions with the system: they record the queries users issue, the URLs of presented and selected documents along w...
Near-duplicate web documents are abundant. Two such documents differ from each other only in a very small portion, for example one that displays advertisements. Such differences are irrele...
Summarizing web pages has recently gained much attention from researchers. Until now, two main types of approaches have been proposed for this task: content- and context-based met...
The user-observed latency of retrieving Web documents is one of
the limiting factors in using the Internet as an information source.
Prefetching became an important technique ...