SkewTune: mitigating skew in mapreduce applications


Download nuage.cs.washington.edu

We present an automatic skew mitigation approach for user-defined MapReduce programs and present SkewTune, a system that implements this approach as a drop-in replacement for an existing MapReduce implementation. There are three key challenges: (a) require no extra input from the user yet work for all MapReduce applications, (b) be completely transparent, and (c) impose minimal overhead if there is no skew. The SkewTune approach addresses these challenges and works as follows: When a node in the cluster becomes idle, SkewTune identifies the task with the greatest expected remaining processing time. The unprocessed input data of this straggling task is then proactively repartitioned in a way that fully utilizes the nodes in the cluster and preserves the ordering of the input data so that the original output can be reconstructed by concatenation. We implement SkewTune as an extension to Hadoop and evaluate its effectiveness using several real applications. The results show that SkewTu...
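The mitigation step described in the abstract can be illustrated with a small sketch. This is not SkewTune's code or Hadoop's API: the `Task` class, `pick_straggler`, `repartition`, and the per-record cost model are hypothetical names and simplifications introduced here. The sketch assumes that remaining time is proportional to the number of unprocessed records and that every node is idle; it splits the straggler's remaining input into contiguous chunks so that concatenating the per-chunk outputs reproduces the original output order.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Task:
    """Hypothetical stand-in for a running map or reduce task."""
    name: str
    remaining: List[str]      # unprocessed input records, in input order
    secs_per_record: float    # assumed constant per-record cost

    def expected_remaining_time(self) -> float:
        # Simplifying assumption: cost grows linearly with record count.
        return len(self.remaining) * self.secs_per_record


def pick_straggler(tasks: Sequence[Task]) -> Task:
    """Return the task with the greatest expected remaining time."""
    return max(tasks, key=lambda t: t.expected_remaining_time())


def repartition(records: Sequence[str], n_nodes: int) -> List[List[str]]:
    """Split records into at most n_nodes contiguous, near-equal chunks.

    Contiguity keeps the input order, so per-chunk outputs can simply
    be concatenated in chunk order.
    """
    base, extra = divmod(len(records), n_nodes)
    chunks, start = [], 0
    for i in range(n_nodes):
        size = base + (1 if i < extra else 0)
        if size:
            chunks.append(list(records[start:start + size]))
        start += size
    return chunks


def process(chunk: Sequence[str]) -> List[str]:
    """Placeholder for the user-defined, order-preserving operator."""
    return [record.upper() for record in chunk]


tasks = [
    Task("reduce-1", ["a", "b"], 1.0),
    Task("reduce-2", ["c", "d", "e", "f", "g", "h", "i"], 1.0),
    Task("reduce-3", ["j"], 1.0),
]

straggler = pick_straggler(tasks)
chunks = repartition(straggler.remaining, n_nodes=3)

# Each chunk would run on a different node; outputs are concatenated in order.
merged = [out for chunk in chunks for out in process(chunk)]
```

Under these assumptions, `merged` equals what the straggler would have produced on its own, which is the property the abstract relies on when it says the original output can be reconstructed by concatenation.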

YongChul Kwon, Magdalena Balazinska, Bill Howe, Jerome A. Rolia


Database | Database Management Systems | Key Challenges | Parallel Databases | SIGMOD 2012


Post Info

Added	27 Sep 2012
Updated	27 Sep 2012
Type	Journal
Year	2012
Where	SIGMOD
Authors	YongChul Kwon, Magdalena Balazinska, Bill Howe, Jerome A. Rolia

