Sciweavers

65 search results - page 11 / 13
» A Failure-Aware Scheduling Strategy in Large-Scale Cluster S...
OSDI
2008
ACM
14 years 8 months ago
Predicting Computer System Failures Using Support Vector Machines
Mitigating the impact of computer failure is possible if accurate failure predictions are provided. Resources, applications, and services can be scheduled around predicted failure...
Errin W. Fulp, Glenn A. Fink, Jereme N. Haack
HPDC
2010
IEEE
13 years 8 months ago
Cluster-wide context switch of virtualized jobs
Clusters are mostly used through Resource Management Systems (RMS) with a static allocation of resources for a bounded amount of time. Those approaches are known to be insufficie...
Fabien Hermenier, Adrien Lebre, Jean-Marc Menaud
CCGRID
2005
IEEE
14 years 1 month ago
User group-based workload analysis and modelling
Knowledge about the workload is an important aspect of scheduling resources such as parallel computers or Grid components. As the scheduling quality highly depends on the character...
Baiyi Song, Carsten Ernemann, Ramin Yahyapour
ICPADS
2006
IEEE
14 years 1 month ago
The Impact of Information Availability and Workload Characteristics on the Performance of Job Co-allocation in Multi-clusters
In this paper, we utilize a bandwidth-centric job communication model that captures the interaction and impact of simultaneously co-allocating jobs across multiple clusters. We ma...
William M. Jones, Walter B. Ligon III, Nishant Shr...
SIGMETRICS
2009
ACM
14 years 2 months ago
MapReduce optimization using regulated dynamic prioritization
We present a system for allocating resources in shared data and compute clusters that improves MapReduce job scheduling in three ways. First, the system uses regulated and user-as...
Thomas Sandholm, Kevin Lai