Sciweavers

213 search results - page 3 / 43
» How much of the web is archived
CIDR
2011
12 years 11 months ago
Longitudinal Analytics on Web Archive Data: It's About Time!
Organizations like the Internet Archive have been capturing Web contents over decades, building up huge repositories of time-versioned pages. The timestamp annotations and the she...
Gerhard Weikum, Nikos Ntarmos, Marc Spaniol, Peter...
ELPUB
1998
ACM
13 years 11 months ago
Research Information Take Away or How to Serve Research Information Fast and Friendly on the Web
In 1997 the library department at the University of Karlskrona/Ronneby was asked to develop a database which could be used to collate and present all the research material and ong...
Peter Linde, Leif Lagebrand
JCDL
2011
ACM
12 years 10 months ago
Archiving the web using page changes patterns: a case study
A pattern is a model or template used to summarize and describe the behavior (or trend) of data that generally exhibits some recurrent events. Patterns have received a considerab...
Myriam Ben Saad, Stéphane Gançarski
SIGMETRICS
2000
ACM
13 years 7 months ago
Crawler-Friendly Web Servers
In this paper we study how to make web servers (e.g., Apache) more crawler friendly. Current web servers offer the same interface to crawlers and regular web surfers, even though ...
Onn Brandman, Junghoo Cho, Hector Garcia-Molina, N...
CIKM
2009
Springer
14 years 2 months ago
Compact full-text indexing of versioned document collections
We study the problem of creating highly compressed fulltext index structures for versioned document collections, that is, collections that contain multiple versions of each docume...
Jinru He, Hao Yan, Torsten Suel