Task Scheduling and File Replication for Data-Intensive Jobs with Batch-shared I/O



This paper addresses the problem of efficient execution of a batch of data-intensive tasks with batch-shared I/O behavior, on coupled storage and compute clusters. Two scheduling schemes are proposed: 1) a 0-1 Integer Programming (IP) based approach, which couples task scheduling and data replication, and 2) a bi-level hypergraph partitioning based heuristic approach (BiPartition), which decouples task scheduling and data replication. The experimental results show that: 1) the IP scheme achieves the best batch execution time, but has significant scheduling overhead, thereby restricting its application to small scale workloads, and 2) the BiPartition scheme is a better fit for larger workloads and systems; it has very low scheduling overhead and no more than 5-10% degradation in solution quality, when compared with the IP based approach.
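To make the idea of batch-shared I/O concrete, the following is a minimal sketch and not the paper's IP formulation or the BiPartition heuristic. It assumes a hypothetical cost model in which each node must stage one copy of every file its tasks read, and the batch time is the load of the slowest node. The task names, file sizes, and helper functions are all illustrative assumptions. An exhaustive search stands in for an exact method on a toy instance and is compared with a sharing-oblivious round-robin placement.

```python
from itertools import product

# Hypothetical toy batch: every name and number here is an illustrative assumption.
TASK_FILES = {
    "t1": {"a", "b"},
    "t2": {"a", "b"},
    "t3": {"c", "d"},
    "t4": {"c", "d"},
}
FILE_SIZE = {"a": 4, "b": 2, "c": 3, "d": 1}  # staging cost per file copy
COMPUTE_TIME = 1  # assumed identical compute cost per task


def node_files(assignment, node):
    """Files that must be staged on `node`; a shared file is copied once per node."""
    needed = set()
    for task, placed_on in assignment.items():
        if placed_on == node:
            needed |= TASK_FILES[task]
    return needed


def makespan(assignment, num_nodes):
    """Batch time under the assumed model: the slowest node's staging plus compute."""
    loads = []
    for node in range(num_nodes):
        staging = sum(FILE_SIZE[f] for f in node_files(assignment, node))
        compute = COMPUTE_TIME * sum(1 for n in assignment.values() if n == node)
        loads.append(staging + compute)
    return max(loads)


def replicas(assignment, num_nodes):
    """Total number of file copies placed across all nodes."""
    return sum(len(node_files(assignment, node)) for node in range(num_nodes))


def exhaustive_best(num_nodes):
    """Try every task-to-node placement; only feasible for tiny batches."""
    tasks = sorted(TASK_FILES)
    best = None
    for placement in product(range(num_nodes), repeat=len(tasks)):
        candidate = dict(zip(tasks, placement))
        if best is None or makespan(candidate, num_nodes) < makespan(best, num_nodes):
            best = candidate
    return best


def round_robin(num_nodes):
    """Sharing-oblivious baseline: deal tasks out to nodes in turn."""
    return {task: i % num_nodes for i, task in enumerate(sorted(TASK_FILES))}


if __name__ == "__main__":
    for label, plan in (("exhaustive", exhaustive_best(2)), ("round-robin", round_robin(2))):
        print(label, plan, "makespan:", makespan(plan, 2), "replicas:", replicas(plan, 2))
```

In this toy instance, placing tasks that read the same files on the same node avoids redundant file copies. The exhaustive search grows exponentially with the number of tasks, which loosely mirrors why an exact approach is limited to small workloads and a low-overhead heuristic is attractive for larger ones.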

Gaurav Khanna 0002, Nagavijayalakshmi Vydyanathan,


Distributed And Parallel Computing | HPDC 2006 | Scheduling Overhead | Significant Scheduling Overhead | Task Scheduling


Post Info

Added	11 Jun 2010
Updated	11 Jun 2010
Type	Conference
Year	2006
Where	HPDC
Authors	Gaurav Khanna 0002, Nagavijayalakshmi Vydyanathan, Ümit V. Çatalyürek, Tahsin M. Kurç, Sriram Krishnamoorthy, P. Sadayappan, Joel H. Saltz
