
EUROSYS
2010
ACM

Delay Scheduling: A Simple Technique for Achieving Locality and Fairness in Cluster Scheduling

As organizations start to use data-intensive cluster computing systems like Hadoop and Dryad for more applications, there is a growing need to share clusters between users. However, there is a conflict between fairness in scheduling and data locality (placing tasks on nodes that contain their input data). We illustrate this problem through our experience designing a fair scheduler for a 600-node Hadoop cluster at Facebook. To address the conflict between locality and fairness, we propose a simple algorithm called delay scheduling: when the job that should be scheduled next according to fairness cannot launch a local task, it waits for a small amount of time, letting other jobs launch tasks instead. We find that delay scheduling achieves nearly optimal data locality in a variety of workloads and can increase throughput by up to 2x while preserving fairness. In addition, the simplicity of delay scheduling makes it applicable under a wide variety of scheduling policies beyond fair sha...
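The waiting rule described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Job` class, the `assign_slot` function, and the `max_skips` parameter are hypothetical names, the "small amount of time" is approximated by a count of skipped scheduling opportunities, and fairness is approximated by serving the job with the fewest running tasks first.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    # Pending tasks, each represented by the set of nodes holding its input data.
    pending: list = field(default_factory=list)
    running: int = 0      # tasks currently running (drives the fairness order)
    skip_count: int = 0   # opportunities passed up while waiting for a local slot


def assign_slot(jobs, node, max_skips):
    """Choose a task for one free slot on `node`.

    Returns (job name, is_local), or None if nothing is launched.
    """
    # Assumed fairness order: the job with the fewest running tasks goes first.
    for job in sorted(jobs, key=lambda j: j.running):
        if not job.pending:
            continue
        # Look for a pending task whose input data lives on this node.
        local = next((task for task in job.pending if node in task), None)
        if local is not None:
            job.pending.remove(local)
            job.running += 1
            job.skip_count = 0
            return job.name, True
        if job.skip_count >= max_skips:
            # The job has waited long enough: give up on locality for this task.
            job.pending.pop(0)
            job.running += 1
            return job.name, False
        # Otherwise the job waits, and the next job in line gets a chance.
        job.skip_count += 1
    return None
```

With `max_skips` set to 0 the sketch degenerates to plain fair scheduling with no waiting; larger values trade a short delay for a better chance of a local launch.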
Added 10 Mar 2010
Updated 10 Mar 2010
Type Conference
Year 2010
Where EUROSYS
Authors Matei Zaharia, Dhruba Borthakur, Joydeep Sen Sarma, Khaled Elmeleegy, Scott Shenker, Ion Stoica