Sciweavers

51 search results - page 1 / 11
» Handling Data Skew in MapReduce
CLOSER
2011
12 years 8 months ago
Handling Data Skew in MapReduce
Benjamin Gufler, Nikolaus Augsten, Angelika Reiser...
ICDE
2012
IEEE
11 years 11 months ago
Load Balancing in MapReduce Based on Scalable Cardinality Estimates
MapReduce has emerged as a popular tool for distributed and scalable processing of massive data sets and is increasingly being used in e-science applications. Unfortunately, the...
Benjamin Gufler, Nikolaus Augsten, Angelika Reiser...
SIGMOD
2010
ACM
14 years 1 months ago
ParaTimer: a progress indicator for MapReduce DAGs
Time-oriented progress estimation for parallel queries is a challenging problem that has received only limited attention. In this paper, we present ParaTimer, a new type of timere...
Kristi Morton, Magdalena Balazinska, Dan Grossman
CLOUDCOM
2010
Springer
13 years 5 months ago
LEEN: Locality/Fairness-Aware Key Partitioning for MapReduce in the Cloud
This paper investigates the problem of Partitioning Skew in MapReduce-based system. Our studies with Hadoop, a widely used MapReduce implementation, demonstrate that the presence ...
Shadi Ibrahim, Hai Jin, Lu Lu, Song Wu, Bingsheng ...
CIKM
2011
Springer
12 years 8 months ago
Block-based load balancing for entity resolution with MapReduce
The effectiveness and scalability of MapReduce-based implementations of complex data-intensive tasks depend on an even redistribution of data between map and reduce tasks. In the...
Lars Kolb, Andreas Thor, Erhard Rahm