Predicting bounds on queuing delay for batch-scheduled parallel machines

Most space-sharing parallel computers presently operated by high-performance computing centers use batch-queuing systems to manage processor allocation. In many cases, users wishing to use these batch-queued resources have accounts at multiple sites and have the option of choosing at which site or sites to submit a parallel job. In such a situation, the amount of time a user’s job will wait in any one batch queue can significantly impact the overall time a user waits from job submission to job completion. In this work, we explore a new method for providing end-users with predictions for the bounds on the queuing delay individual jobs will experience. We evaluate this method using batch scheduler logs for distributed-memory parallel machines that cover a 9-year period at 7 large HPC centers. Our results show that it is possible to predict delay bounds reliably for jobs in different queues, and for jobs requesting different ranges of processor counts. Using this information, scienti...
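
To make the idea of a "bound on queuing delay" concrete, the sketch below computes an upper confidence bound on a quantile of wait time from a list of past waits, using order statistics and the binomial distribution. This is a generic non-parametric technique shown for illustration only; the abstract does not state which method the paper uses, so it should not be read as the authors' algorithm. The function name `quantile_upper_bound`, its parameters, and the assumption that historical waits are independent and identically distributed are all hypothetical.

```python
from math import comb


def quantile_upper_bound(waits, quantile=0.95, confidence=0.95):
    """Return an observed wait that is, with probability >= `confidence`,
    at least as large as the `quantile` of the wait-time distribution.

    Assumption (not from the source): waits are i.i.d. samples.
    Returns None when there are too few samples to give such a bound.
    Intended for modest sample sizes; very large inputs can overflow
    when the binomial coefficient is converted to float.
    """
    xs = sorted(waits)
    n = len(xs)
    # The k-th smallest sample is >= the true quantile exactly when at most
    # k-1 samples fall below it, i.e. with probability
    # P(Binomial(n, quantile) <= k-1). Find the smallest such k.
    cdf = 0.0
    for k in range(1, n + 1):
        j = k - 1
        cdf += comb(n, j) * quantile ** j * (1 - quantile) ** (n - j)
        if cdf >= confidence:
            return xs[k - 1]
    return None


if __name__ == "__main__":
    # Hypothetical history of queue waits in seconds.
    history = [30 * i for i in range(1, 101)]
    print(quantile_upper_bound(history, quantile=0.95, confidence=0.95))
```

Under these assumptions, a user could compare such bounds across sites before choosing where to submit a job. Note that a 95% bound on the 95th percentile needs at least 59 past observations; with fewer, the function returns None.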

John Brevik, Daniel Nurmi, Richard Wolski

Distributed And Parallel Computing | Distributed-memory Parallel Machines | Most Space-sharing Parallel | Parallel Job | PPOPP 2006 |

» Predicting Bounds on Queuing Delay in Spaceshared Computing Environments

» The Inherent Queuing Delay of Parallel Packet Switches

» Load balancing without regret in the bulletin board model

Post Info

Added	14 Jun 2010
Updated	14 Jun 2010
Type	Conference
Year	2006
Where	PPOPP
Authors	John Brevik, Daniel Nurmi, Richard Wolski
