Many information retrieval systems use the inverted file as their indexing structure. The inverted file, however, is not well suited to incremental updates when new documents are ...
We study how best to schedule scans of large data files, in the presence of many simultaneous requests to a common set of files. The objective is to maximize the overall rate of p...
File sharing systems cause a huge portion of traffic on the Internet. In the peer-to-peer approach, unicast delivery of content is the common case. Unfortun...
Future large distributed systems will be made by interconnecting highly autonomous subsystems, rather than by building ever more elaborate complexes which attempt to provide a sin...
This paper studies five real-world data-intensive workflow applications in the fields of natural language processing, astronomy image analysis, and web data analysis. Data intensiv...