Due to the growing importance of the World Wide Web, archiving it has become crucial for preserving useful sources of information. To keep a web archive up to date, crawlers ha...
Nowadays, many applications are interested in detecting and discovering changes on the web to help users understand page updates and, more generally, web dynamics. Web archiv...
In the near future, commodity hardware is expected to incorporate both flash and magnetic disks. In this paper we study how the storage layer of a database system can benefit from...
Wikipedia is a large and rapidly growing Web-based collaborative authoring environment, where anyone on the Internet can create, modify, and delete pages about encyclopedic topics...
Web-based applications are typically required to be highly customizable and configurable. New application requirements have to be introduced rapidly, often without stopping the ru...
The Web can be naturally modeled as a directed graph, consisting of a set of abstract nodes (the pages) joined by directional edges (the hyperlinks). Hyperlinks encode a considerab...
There have been many attempts to study the content of the web, through either human or automatic agents. Five different previously used web survey methodologies are described and ...
Digital preservation of newspaper archives aims both at the salvation of endangered material (paper) and at the creation of digital library services that will allow full utilizatio...
Basilios Gatos, S. L. Mantzaris, Stavros J. Perant...
In the most abstract sense, we build web pages so that computers can read them. The software that people use to access web pages is what "reads" the document. How the page is ren...