The productivity of HPC systems is determined not only by their performance, but also by their reliability. The conventional method to limit the impact of failures is checkpointing...
Load balancing is a key concern when developing parallel and distributed computing applications. The emergence of computational grids extends this problem, where issues of cross-d...
Junwei Cao, Daniel P. Spooner, Stephen A. Jarvis, ...
In order for Grids to become relied upon for critical infrastructure and reliable scientific computing, Grid-wide management must be automated so that it is possible in quickly...
Zach Hill, Jonathan C. Rowanhill, Anh Nguyen-Tuong...
We consider the classical problem of scheduling parallel unrelated machines. Each job is to be processed by exactly one machine. Processing job j on machine i requires time p_ij. ...
This paper examines the issue of dynamically scheduling applications on a wide-area network computing system. We construct a simulation model for wide-area task allocation problem...