MapReduce is a popular framework for data-intensive distributed computing of batch jobs. To simplify fault tolerance, the output of each MapReduce task and job is materialized to ...
Tyson Condie, Neil Conway, Peter Alvaro, Joseph M....
The RDF data model is gaining importance for applications in computational biology, knowledge sharing, and social communities. Recent work on RDF engines has focused on scalable p...
There has been a significant amount of excitement and recent work on column-oriented database systems ("column-stores"). These database systems have been shown to perfor...
This paper introduces an efficient algorithm for Producing Early Results in Multi-join query plans (PermJoin, for short). While most previous research focuses only on the
case of ...
Justin J. Levandoski, Mohamed E. Khalefa, Mohamed ...
This paper describes a new method for providingtransparent fault tolerance for parallel applications on a network of workstations. We have designed our method in the context of sh...