

ParaTimer: a progress indicator for MapReduce DAGs

Time-oriented progress estimation for parallel queries is a challenging problem that has received only limited attention. In this paper, we present ParaTimer, a new type of time-remaining indicator for parallel queries. Several parallel data processing systems exist. ParaTimer targets environments where declarative queries are translated into ensembles of MapReduce jobs. ParaTimer builds on previous techniques and makes two key contributions. First, it estimates the progress of queries that translate into directed acyclic graphs of MapReduce jobs, where jobs on different paths can execute concurrently (unlike prior work that looked at sequences only). For such queries, we use a new type of critical-path-based progress-estimation approach. Second, ParaTimer handles a variety of real systems challenges such as failures and data skew. To handle unexpected changes in query execution times due to runtime condition changes, ParaTimer provides users with not only one but with a set of time-r...
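The critical-path idea mentioned in the abstract can be illustrated with a minimal sketch. This is not ParaTimer's actual algorithm, which the abstract does not spell out. It assumes that each job already has a per-job estimate of remaining time, that jobs on independent paths run fully in parallel, and that there are no failures or data skew. The function name `critical_path_remaining` and the job names are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def critical_path_remaining(remaining, deps):
    """Estimate time remaining for a DAG of jobs as its longest path.

    remaining: dict mapping job name -> estimated remaining seconds (assumed given)
    deps: dict mapping job name -> iterable of jobs that must finish first
    Returns (estimated seconds remaining, list of jobs on the critical path).
    """
    if not remaining:
        return 0.0, []
    graph = {job: set(deps.get(job, ())) for job in remaining}
    finish = {}     # earliest finish time of each job, measured from now
    best_pred = {}  # predecessor that determines each job's start time
    # static_order() yields every job after all of its prerequisites.
    for job in TopologicalSorter(graph).static_order():
        start, pred = 0.0, None
        for p in graph[job]:
            if pred is None or finish[p] > start:
                start, pred = finish[p], p
        finish[job] = start + remaining[job]
        best_pred[job] = pred
    # The job that finishes last ends the critical path; walk back from it.
    node = max(finish, key=finish.get)
    total = finish[node]
    path = []
    while node is not None:
        path.append(node)
        node = best_pred[node]
    path.reverse()
    return total, path


# Hypothetical query plan: j3 joins the outputs of j1 and j2; j4 reads only j1.
remaining = {"j1": 30.0, "j2": 50.0, "j3": 20.0, "j4": 10.0}
deps = {"j3": ["j1", "j2"], "j4": ["j1"]}
total, path = critical_path_remaining(remaining, deps)
print(total, path)
```

In this example the path through j2 and j3 dominates, so the estimate is driven by those two jobs even though j1 and j4 run concurrently. A real indicator would also have to account for limited cluster slots, failures, and skew, which this sketch ignores.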
Kristi Morton, Magdalena Balazinska, Dan Grossman
Added 18 Jul 2010
Updated 18 Jul 2010
Type Conference
Year 2010