Probabilistic QoS Guarantees for Supercomputing Systems


Supercomputing systems must be able to reliably and efficiently complete their assigned workloads, even in the presence of failures. This paper proposes a system that allows the system and users to negotiate a mutually desirable risk strategy; in order to accomplish this, the system makes probabilistic guarantees on quality of service (QoS), of the form, “Job j can be completed by deadline d with probability p.” In order to make such guarantees, the system uses event prediction (forecasting) in conjunction with fault-aware job scheduling and cooperative checkpointing strategies. Using job logs and failure traces from actual high performance computing systems, we employ trace-based simulations to assess the effects of the prediction accuracy (a) and user risk strategy (U) on a variety of performance metrics. Compared to a system that does not use event prediction, a high forecasting accuracy resulted in QoS and utilization improvements of as much as 6%, along with an 89% reduction...
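As a rough illustration of the guarantee form quoted in the abstract ("Job j can be completed by deadline d with probability p"), the negotiation could be sketched as below. This is only a sketch under stated assumptions, not the paper's method: the `Offer` type, the `negotiate` function, the sample numbers, and the reading of the user risk strategy U as a minimum acceptable success probability are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Offer:
    """One candidate guarantee the system could quote for a job."""
    deadline: float      # candidate deadline d, in hours after submission
    probability: float   # estimated probability p of finishing by d


def negotiate(offers: List[Offer], user_risk: float) -> Optional[Offer]:
    """Pick the earliest deadline whose success probability is acceptable.

    ASSUMPTION: user_risk (U) is treated here as the lowest probability of
    success the user will accept. The abstract does not define U this way.
    """
    acceptable = [o for o in offers if o.probability >= user_risk]
    # min() with default=None returns None when no offer qualifies.
    return min(acceptable, key=lambda o: o.deadline, default=None)


# Illustrative numbers only; they are not taken from the paper.
offers = [Offer(10.0, 0.60), Offer(12.0, 0.85), Offer(16.0, 0.99)]
chosen = negotiate(offers, user_risk=0.80)
```

In this sketch a user with a threshold of 0.80 would be quoted the 12-hour deadline, while a user demanding more certainty than any offer provides would get no guarantee at all.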

Adam J. Oliner, Larry Rudolph, Ramendra K. Sahoo, José E. Moreira, Manish Gupta


Computer Networks | DSN 2005 | Event Prediction | Probabilistic QoS Guarantees | Risk Strategy


» QoS guarantee using probabilistic deadlines

» A Model and a Design Approach to Building QoS Adaptive Systems

» Cross-Layer Analysis of the End-to-End Delay Distribution in Wireless Sensor Networks

» Analyzing and Minimizing the Impact of Opportunity Cost in QoS-aware Job Scheduling

» DVS for buffer-constrained architectures with predictable QoS-energy tradeoffs

» ControlWare: A Middleware Architecture for Feedback Control of Software Performance

» A Stochastic Approach to Measuring the Robustness of Resource Allocations in Distributed S...

» Stochastic robustness metric and its use for static resource allocations

Post Info

Added	24 Jun 2010
Updated	24 Jun 2010
Type	Conference
Year	2005
Where	DSN
Authors	Adam J. Oliner, Larry Rudolph, Ramendra K. Sahoo, José E. Moreira, Manish Gupta
