As storage systems scale to thousands of disks, data distribution and load balancing become increasingly important. We present an algorithm for allocating data objects to disks as...
Clustering by document concepts is a powerful way of retrieving information from a large number of documents. This task in general does not make any assumption on the data distrib...
A variety of real-world applications requires a meaningful online analysis of transient data streams. An important building block of many analysis tasks is the characterization of...
The interest among a geographically distributed user base to mine massive collections of scientific data propels the need for efficient data dissemination solutions. An optimal dat...
Program parallelization requires mapping computation and data to processing elements. Navigational Programming (NavP), based on the principle of migrating computations, offers a d...
Lei Pan, Jingling Xue, Ming Kin Lai, Michael B. Di...
In this paper, we describe and evaluate a scalable network of Active Elements (AE) that implements a user-empowered virtual-multicast overlay network for synchronous data distributio...