Packing the most onto your cloud

Download www.cs.uwaterloo.ca

Parallel dataflow programming frameworks such as Map-Reduce are increasingly being used for large scale data analysis on computing clouds. It is therefore becoming important to automatically optimize the performance of these frameworks. In this paper, we deal with one particular optimization problem, namely scheduling sets of Map-Reduce jobs on a cluster of machines. We present a scheduler that takes job characteristics into account and finds a schedule that minimizes the total completion time of the set of jobs. Our scheduler decides on the number of machines to assign to each job, and it tries to pack as many jobs on the machines as the machine resources can support. To enable flexible assignment of jobs onto machines, we run the Map-Reduce jobs in virtual machines. Our scheduling problem is formulated as a constrained optimization problem, and we experimentally demonstrate using the Hadoop open source Map-Reduce implementation that the solution to this problem results in benefi...
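The abstract describes choosing how many machines to give each job so that the whole set finishes as early as possible. The listing does not include the paper's actual formulation, so the following is only a toy illustration of that kind of constrained search, not the authors' scheduler. The cost model (a fixed serial part plus a perfectly divisible parallel part), the assumption that all jobs run side by side, and the names runtime and best_allocation are all hypothetical.

```python
from itertools import product


def runtime(job, machines):
    """Hypothetical cost model: a fixed serial part plus a parallel part
    that divides evenly across the assigned machines."""
    serial, parallel = job
    return serial + parallel / machines


def best_allocation(jobs, total_machines):
    """Brute-force search over machine counts per job.

    Constraint: every job gets at least one machine and the counts sum to
    at most total_machines. Objective: minimize the completion time of the
    whole set, assuming (for this sketch) that all jobs run concurrently.
    Returns (completion_time, allocation), or None if no allocation fits.
    """
    best = None
    for alloc in product(range(1, total_machines + 1), repeat=len(jobs)):
        if sum(alloc) > total_machines:
            continue  # violates the cluster-size constraint
        finish = max(runtime(job, m) for job, m in zip(jobs, alloc))
        if best is None or finish < best[0]:
            best = (finish, alloc)
    return best


if __name__ == "__main__":
    # Two made-up jobs as (serial_time, parallel_work) on a 3-machine cluster.
    print(best_allocation([(1.0, 8.0), (1.0, 4.0)], 3))
```

With these made-up inputs the search gives two machines to the heavier job and one to the lighter job, so both finish at time 5.0. Exhaustive enumeration is only practical for tiny inputs; it is used here for clarity, not because the paper is known to use it.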

Ashraf Aboulnaga, Ziyu Wang, Zi Ye Zhang

CIKM 2009 | Database | Map-Reduce Jobs | Optimization Problem | Particular Optimization Problem |

Post Info

Added	26 May 2010
Updated	26 May 2010
Type	Conference
Year	2009
Where	CIKM
Authors	Ashraf Aboulnaga, Ziyu Wang, Zi Ye Zhang
