Map-reduce framework has received a significant attention and is being used for programming both large-scale clusters and multi-core systems. While the high productivity aspect of map-reduce has been well accepted, it is not clear if the API results in efficient implementations for different subclasses of data-intensive applications. In this paper, we present a system MATE (Map-reduce with an AlternaTE API), that provides a high-level, but distinct API. Particularly, our API includes a programmer-managed reduction object, which results in lower memory requirements at runtime for many dataintensive applications. MATE implements this API on top of the Phoenix system, a multi-core map-reduce implementation from Stanford. We evaluate our system using three data mining applications and compare its performance to that of both Phoenix and Hadoop. Our results show that for all the three applications, MATE outperforms Phoenix and Hadoop. Despite achieving good scalability, MATE also maintains t...
Wei Jiang, Vignesh T. Ravi, Gagan Agrawal