Sciweavers

HPDC
2002
IEEE

Decoupling Computation and Data Scheduling in Distributed Data-Intensive Applications

In high energy physics, bioinformatics, and other disciplines, we encounter applications involving numerous, loosely coupled jobs that both access and generate large data sets. So-called Data Grids seek to harness geographically distributed resources for such large-scale data-intensive problems. Yet effective scheduling in such environments is challenging, due to a need to address a variety of metrics and constraints (e.g., resource utilization, response time, global and local allocation policies) while dealing with multiple, potentially independent sources of jobs and a large number of storage, compute, and network resources. We describe a scheduling framework that addresses these problems. Within this framework, data movement operations may be either tightly bound to job scheduling decisions or, alternatively, performed by a decoupled, asynchronous process on the basis of observed data access patterns and load. We develop a family of job scheduling and data movement (replication) al...
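The decoupled arrangement described in the abstract can be illustrated with a minimal sketch. It is not the authors' framework or any of their algorithms (the abstract is truncated before they are named). It only shows one way a job scheduler and an independent replication step might interact. All names (`Site`, `schedule_job`, `replicate_popular`) and the threshold-based popularity rule are hypothetical assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Site:
    """Hypothetical grid site: the datasets it stores and its job queue."""
    name: str
    datasets: set = field(default_factory=set)
    queue: list = field(default_factory=list)


def schedule_job(job_id, dataset, sites, access_counts):
    """Job scheduling step: place the job at the least-loaded site that
    already holds the dataset, and record the access.

    If no site holds the dataset, the job goes to the least-loaded site
    overall; the remote fetch that would then be needed is not modeled.
    """
    holders = [s for s in sites if dataset in s.datasets]
    candidates = holders or sites
    target = min(candidates, key=lambda s: len(s.queue))
    target.queue.append(job_id)
    access_counts[dataset] += 1
    return target


def replicate_popular(sites, access_counts, threshold):
    """Data scheduling step, run independently of job placement.

    Assumed policy: any dataset accessed at least `threshold` times is
    copied to the least-loaded site that lacks it, and its counter resets.
    """
    moved = []
    for dataset, count in list(access_counts.items()):
        if count < threshold:
            continue
        lacking = [s for s in sites if dataset not in s.datasets]
        if not lacking:
            continue
        target = min(lacking, key=lambda s: len(s.queue))
        target.datasets.add(dataset)
        moved.append((dataset, target.name))
        access_counts[dataset] = 0
    return moved


sites = [Site("a", {"d1"}), Site("b")]
counts = Counter()
first = schedule_job(1, "d1", sites, counts)
second = schedule_job(2, "d1", sites, counts)
moved = replicate_popular(sites, counts, threshold=2)
third = schedule_job(3, "d1", sites, counts)
```

In this sketch the first two jobs queue at site `a`, the only holder of `d1`. Once the replication step has copied `d1` to site `b`, the third job can run there instead. The job scheduler never triggers a transfer itself, which is the sense in which the two decisions are decoupled here.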
Kavitha Ranganathan, Ian T. Foster
Added: 14 Jul 2010
Updated: 14 Jul 2010
Type: Conference
Year: 2002
Where: HPDC
Authors: Kavitha Ranganathan, Ian T. Foster