Sciweavers

443 search results
» Recycling Course Web Pages for the Semantic Web
WWW
2007
ACM
14 years 8 months ago
Random web crawls
This paper proposes a random Web crawl model. A Web crawl is a (biased and partial) image of the Web. This paper deals with the hyperlink structure, i.e. a Web crawl is a graph, w...
Toufik Bennouas, Fabien de Montgolfier
WWW
2004
ACM
14 years 8 months ago
What's new on the web?: the evolution of the web from a search engine perspective
We seek to gain improved insight into how Web search engines should cope with the evolving Web, in an attempt to provide users with the most up-to-date results possible. For this ...
Alexandros Ntoulas, Junghoo Cho, Christopher Olsto...
WIDM
2005
ACM
14 years 1 month ago
DirectoryRank: ordering pages in web directories
Web Directories are repositories of Web pages organized in a hierarchy of topics and sub-topics. In this paper, we present DirectoryRank, a ranking framework that orders the pages...
Vlassis Krikos, Sofia Stamou, Pavlos Kokosis, Alex...
SIGIR
2004
ACM
14 years 1 month ago
Block-based web search
Multiple-topic and varying-length of web pages are two negative factors significantly affecting the performance of web search. In this paper, we explore the use of page segmentati...
Deng Cai, Shipeng Yu, Ji-Rong Wen, Wei-Ying Ma
WWW
2007
ACM
14 years 8 months ago
Detecting near-duplicates for web crawling
Near-duplicate web documents are abundant. Two such documents differ from each other in a very small portion that displays advertisements, for example. Such differences are irrele...
Gurmeet Singh Manku, Arvind Jain, Anish Das Sarma