Mitigating the impact of computer failure is possible if accurate failure predictions are provided. Resources, applications, and services can be scheduled around predicted failure...
Clusters are mostly used through Resources Management Systems (RMS) with a static allocation of resources for a bounded amount of time. Those approaches are known to be insufficie...
Knowledge about the workload is an important aspect for scheduling of resources as parallel computers or Grid components. As the scheduling quality highly depends on the character...
In this paper, we utilize a bandwidth-centric job communication model that captures the interaction and impact of simultaneously co-allocating jobs across multiple clusters. We ma...
William M. Jones, Walter B. Ligon III, Nishant Shr...
We present a system for allocating resources in shared data and compute clusters that improves MapReduce job scheduling in three ways. First, the system uses regulated and user-as...