Accelerating Large-scale Data Exploration through Data Diffusion



Data-intensive applications often require exploratory analysis of large datasets. If analysis is performed on distributed resources, data locality can be crucial to high throughput and performance. We propose a "data diffusion" approach that acquires compute and storage resources dynamically, replicates data in response to demand, and schedules computations close to data. As demand increases, more resources are acquired, thus allowing faster response to subsequent requests that refer to the same data; when demand drops, resources are released. This approach can provide the benefits of dedicated hardware without the associated high costs, depending on workload and resource characteristics. The approach is reminiscent of cooperative caching, web-caching, and peer-to-peer storage systems, but addresses different application demands. Other data-aware scheduling approaches assume dedicated resources, which can be expensive and/or inefficient if load varies significantly. To explo...
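The three mechanisms named in the abstract (acquiring and releasing resources as demand changes, replicating data in response to demand, and scheduling computations close to data) can be illustrated with a toy model. The sketch below is not the authors' implementation: the class names, the per-executor capacity, and the sizing and placement policies are all assumptions chosen only to make the idea concrete.

```python
from dataclasses import dataclass, field


@dataclass
class Executor:
    """A hypothetical worker node with a local cache of file names."""
    name: str
    cache: set = field(default_factory=set)


class DiffusionScheduler:
    """Toy data-aware scheduler; every policy here is an illustrative assumption."""

    def __init__(self, max_executors=4, tasks_per_executor=2):
        self.max_executors = max_executors
        self.tasks_per_executor = tasks_per_executor
        self.executors = []
        self.cache_hits = 0
        self.cache_misses = 0

    def _resize(self, pending):
        # Acquire or release executors so capacity tracks current demand.
        wanted = -(-pending // self.tasks_per_executor)  # ceiling division
        wanted = min(self.max_executors, wanted)
        while len(self.executors) < wanted:
            self.executors.append(Executor(f"exec-{len(self.executors)}"))
        while len(self.executors) > wanted:
            self.executors.pop()  # a released node takes its cache with it

    def run_batch(self, files):
        """Place one task per requested file name; return (file, executor) pairs."""
        self._resize(len(files))
        load = {e.name: 0 for e in self.executors}
        placements = []
        for f in files:
            # Prefer an executor that already caches the file and has spare capacity.
            holders = [e for e in self.executors if f in e.cache]
            best = min(holders, key=lambda e: load[e.name]) if holders else None
            if best is not None and load[best.name] < self.tasks_per_executor:
                target = best
            else:
                # Otherwise fall back to the least-loaded executor overall.
                target = min(self.executors, key=lambda e: load[e.name])
            if f in target.cache:
                self.cache_hits += 1
            else:
                target.cache.add(f)  # replicate the data where the demand is
                self.cache_misses += 1
            load[target.name] += 1
            placements.append((f, target.name))
        return placements
```

In this model, a burst of requests for one file first fills the executor that holds it, then spills to a second executor, which receives its own replica. Later requests for that file hit either cache. When the batch size shrinks, surplus executors are released along with their cached copies.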

Ioan Raicu, Yong Zhao, Ian T. Foster, Alexander S. Szalay


CORR 2008 | Data Diffusion | Data-aware Scheduling | Education | Resource


» Creating Large Scale Database Servers

» A Robust Data Delivery Protocol for Large Scale Sensor Networks

» On the storage management and analysis of multi similarity for large scale protein structu...

» From small scale to large scale user participation a case study of participatory design in...

» LargeScale Simulation of Replica Placement Algorithms for a Serverless Distributed File Sy...

» Accelerating Checkpoint Operation by NodeLevel Write Aggregation on Multicore Systems

» Social SQL Tools for Exploring Social Databases

» Finding Optimal Targets for Change Agents A Computer Simulation of Innovation Diffusion

Post Info

Added	09 Dec 2010
Updated	09 Dec 2010
Type	Journal
Year	2008
Where	CORR
Authors	Ioan Raicu, Yong Zhao, Ian T. Foster, Alexander S. Szalay


