We present a system for allocating resources in shared data and compute clusters that improves MapReduce job scheduling in three ways. First, the system uses regulated and user-as...
As clusters grow in scale, node failure becomes commonplace in large-scale cluster systems. As an important part of cluster operating system software, job scheduling tak...
Linping Wu, Dan Meng, Jianfeng Zhan, Wang Lei, Bib...
In this paper we introduce Challenger, a multiagent system that performs completely distributed resource allocation. Challenger consists of agents that individually manage local ...
Distributed clusters like the Grid and PlanetLab enable the same statistical multiplexing efficiency gains for computing as the Internet provides for networking. One major challen...
Kevin Lai, Lars Rasmusson, Eytan Adar, Stephen Sor...
We consider the problem of admission control in resource sharing systems, such as web servers and transaction processing systems, when the job size distribution has high variabili...