We describe a new, non-FCFS policy for scheduling parallel jobs on systems that may be part of a computational grid. Our algorithm continuously monitors the system (i.e., the intensity of incoming jobs and the variability of their resource demands) and adapts its scheduling parameters according to workload fluctuations. The proposed policy is based on backfilling, which reduces resource fragmentation by executing jobs in an order different from their arrival order without delaying certain previously submitted jobs. We maintain multiple job queues that effectively separate jobs according to their projected execution time. Our policy supports different job priorities and job reservations, making it appropriate for scheduling jobs on parallel systems that are part of a computational grid. Detailed performance comparisons via simulation using traces from the Parallel Workload Archive indicate that the proposed policy consistently outperforms traditional backfilling.
Barry G. Lawson, Evgenia Smirni
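
To make the backfilling idea in the abstract concrete, the following is a minimal sketch of a single-queue, aggressive (EASY-style) backfilling step. It is not the multiple-queue, self-adapting policy proposed by the authors, whose details are not given here. The names `Job` and `schedule_now`, and the use of a single estimated runtime per job, are assumptions made only for illustration. The sketch protects just the first blocked job in the queue: a later job may start early only if it fits in the idle processors and either finishes before the blocked job's reserved start time or uses processors that job will not need.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    procs: int    # processors requested
    runtime: int  # estimated execution time (hypothetical user estimate)


def schedule_now(now, free, running, queue):
    """Return the names of queued jobs to start at time `now`.

    free:    number of idle processors at `now`
    running: list of (end_time, procs) for jobs currently executing
    queue:   waiting jobs in arrival order
    """
    started = []
    running = list(running)
    queue = list(queue)

    # Start jobs in arrival order for as long as the head of the queue fits.
    while queue and queue[0].procs <= free:
        job = queue.pop(0)
        free -= job.procs
        running.append((now + job.runtime, job.procs))
        started.append(job.name)
    if not queue:
        return started

    # The head job is blocked: find the earliest time enough processors free up.
    head = queue[0]
    avail = free
    shadow = None
    for end, procs in sorted(running):
        avail += procs
        if avail >= head.procs:
            shadow = end
            break
    if shadow is None:
        return started  # head job can never fit; nothing is reserved
    extra = avail - head.procs  # processors the head job will not need at `shadow`

    # Backfill later jobs that cannot delay the head job's reservation.
    for job in queue[1:]:
        if job.procs > free:
            continue
        if now + job.runtime <= shadow:
            free -= job.procs      # finishes before the reservation begins
            started.append(job.name)
        elif job.procs <= extra:
            free -= job.procs      # runs past the reservation on spare processors
            extra -= job.procs
            started.append(job.name)
    return started
```

For example, with 4 idle processors and one running job that releases 4 more at time 10, a queue of `Job("A", 6, 5)`, `Job("B", 2, 8)`, `Job("C", 4, 20)`, `Job("D", 2, 30)` leaves A waiting with a reservation at time 10, while B and D are started out of arrival order without delaying A.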