HPDC
2006
IEEE

Task Scheduling and File Replication for Data-Intensive Jobs with Batch-shared I/O

This paper addresses the efficient execution of a batch of data-intensive tasks with batch-shared I/O behavior on coupled storage and compute clusters. Two scheduling schemes are proposed: 1) a 0-1 Integer Programming (IP) based approach, which couples task scheduling and data replication, and 2) a bi-level hypergraph-partitioning-based heuristic (BiPartition), which decouples task scheduling from data replication. The experimental results show that 1) the IP scheme achieves the best batch execution time but incurs significant scheduling overhead, which restricts it to small-scale workloads, and 2) the BiPartition scheme is a better fit for larger workloads and systems: it has very low scheduling overhead and loses no more than 5-10% in solution quality compared with the IP-based approach.
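The trade-off described in the abstract, an exact search that scales poorly versus a cheap heuristic that is slightly worse, can be illustrated with a toy sketch. This is not the paper's IP formulation or the BiPartition algorithm: brute-force enumeration stands in for the exact approach, and a simple greedy placement stands in for the heuristic. The task names, file sets, node count, per-node task limit, and the cost measure (number of file copies rather than batch execution time) are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical toy batch: each task reads a set of files, and files are shared
# between tasks (the "batch-shared I/O" pattern). All values are assumptions.
TASK_FILES = {
    "t0": {"a", "b"},
    "t1": {"a", "b"},
    "t2": {"c", "d"},
    "t3": {"c", "d"},
}
NUM_NODES = 2
MAX_TASKS_PER_NODE = 2  # simple load-balance constraint (an assumption)


def replicas_needed(assignment):
    """Count the (file, node) copies needed so every task reads its files locally."""
    placed = set()
    for task, node in assignment.items():
        for f in TASK_FILES[task]:
            placed.add((f, node))
    return len(placed)


def balanced(assignment):
    """Check that no node receives more than MAX_TASKS_PER_NODE tasks."""
    counts = [0] * NUM_NODES
    for node in assignment.values():
        counts[node] += 1
    return max(counts) <= MAX_TASKS_PER_NODE


def exhaustive_schedule():
    """Try every task-to-node assignment: exact, but exponential in the task count."""
    tasks = sorted(TASK_FILES)
    best = None
    for nodes in product(range(NUM_NODES), repeat=len(tasks)):
        assignment = dict(zip(tasks, nodes))
        if not balanced(assignment):
            continue
        cost = replicas_needed(assignment)
        if best is None or cost < best[0]:
            best = (cost, assignment)
    return best


def greedy_schedule():
    """Place each task on the non-full node that already holds most of its files."""
    assignment = {}
    node_files = [set() for _ in range(NUM_NODES)]
    node_load = [0] * NUM_NODES
    for task in sorted(TASK_FILES):
        candidates = [n for n in range(NUM_NODES) if node_load[n] < MAX_TASKS_PER_NODE]
        # Prefer the largest file overlap, then the lighter-loaded node.
        node = max(
            candidates,
            key=lambda n: (len(TASK_FILES[task] & node_files[n]), -node_load[n]),
        )
        assignment[task] = node
        node_files[node] |= TASK_FILES[task]
        node_load[node] += 1
    return replicas_needed(assignment), assignment


if __name__ == "__main__":
    print("exhaustive:", exhaustive_schedule())
    print("greedy:    ", greedy_schedule())
```

On this small input both strategies co-locate tasks that share files, so each file is copied only once. The exhaustive search examines NUM_NODES ** len(TASK_FILES) assignments, which hints at why an exact formulation becomes impractical as the batch grows.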
Added 11 Jun 2010
Updated 11 Jun 2010
Type Conference
Year 2006
Where HPDC
Authors Gaurav Khanna, Nagavijayalakshmi Vydyanathan, Ümit V. Çatalyürek, Tahsin M. Kurç, Sriram Krishnamoorthy, P. Sadayappan, Joel H. Saltz