In this paper, we study the problem of constructing and maintaining a large shared repository of web pages. We discuss the unique characteristics of such a repository, propose an ...
Jun Hirai, Sriram Raghavan, Hector Garcia-Molina, ...
In prior work we have demonstrated that search engine caches and archiving projects like the Internet Archive’s Wayback Machine can be used to “lazily preserve” websites and...
We describe our experimental rhetoric engine Vox Populi that generates biased video-sequences from a repository of video interviews and other related audio-visual web sources. Use...
The availability of repositories of Web service descriptions enables interesting forms of dynamic Web service discovery, such as searching for Web services exposing a specified beh...
Search engines provide search results based on a large repository of pages downloaded by a web crawler from several servers. To provide best results, this repository must be kept ...