Map-reduce-merge: simplified relational data processing on large clusters



Map-Reduce is a programming model that enables easy development of scalable parallel applications to process vast amounts of data on large clusters of commodity machines. Through a simple interface with two functions, map and reduce, this model facilitates parallel implementation of many real-world tasks such as data processing for search engines and machine learning. However, this model does not directly support processing multiple related heterogeneous datasets. While processing relational data is a common need, this limitation causes difficulties and/or inefficiency when Map-Reduce is applied to relational operations like joins. We improve Map-Reduce into a new model called Map-Reduce-Merge. It adds to Map-Reduce a Merge phase that can efficiently merge data already partitioned and sorted (or hashed) by map and reduce modules. We also demonstrate that this new model can express relational algebra operators as well as implement several join algorithms. Categories and Subject Descript...
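
The idea of a Merge phase operating on two outputs that map and reduce have already keyed and sorted can be illustrated with a small single-process sketch. This is not the paper's API or implementation: the helper names (map_reduce, merge_join), the toy employee and department records, and the choice of a sort-merge equi-join with unique keys per side are all assumptions made for illustration.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Run one in-memory map/reduce pass (hypothetical helper).

    Returns a list of (key, reduced_value) pairs sorted by key,
    standing in for the partitioned and sorted reducer output.
    """
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return sorted((key, reducer(key, values)) for key, values in groups.items())


def merge_join(left, right, merger):
    """Sort-merge equi-join of two key-sorted (key, value) lists.

    Assumes each key appears at most once per side, which holds
    for the reducer outputs produced by map_reduce above.
    """
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        left_key, left_value = left[i]
        right_key, right_value = right[j]
        if left_key < right_key:
            i += 1
        elif left_key > right_key:
            j += 1
        else:
            out.append(merger(left_key, left_value, right_value))
            i += 1
            j += 1
    return out


# Hypothetical datasets: (name, dept_id, bonus) and (dept_id, dept_name).
employees = [("alice", 10, 500), ("bob", 20, 300), ("carol", 10, 200), ("dave", 30, 100)]
departments = [(10, "search"), (20, "ads"), (40, "maps")]

# Each dataset gets its own map/reduce pass, keyed on dept_id.
bonus_by_dept = map_reduce(
    employees,
    mapper=lambda e: [(e[1], e[2])],
    reducer=lambda dept_id, bonuses: sum(bonuses),
)
name_by_dept = map_reduce(
    departments,
    mapper=lambda d: [(d[0], d[1])],
    reducer=lambda dept_id, names: names[0],
)

# The merge step joins the two sorted outputs on their shared key.
joined = merge_join(
    bonus_by_dept,
    name_by_dept,
    merger=lambda dept_id, total, name: (dept_id, name, total),
)
print(joined)
```

Because both inputs arrive sorted by key, the merge step makes a single linear pass over each side. A real cluster implementation would also have to decide which partitions of the two reducer outputs are merged together, which this sketch leaves out.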

Hung-chih Yang, Ali Dasdan, Ruey-Lung Hsiao, Douglas Stott Parker Jr.


Database | Model Called Mapreduce-merge | Relational Algebra Operators | Relational Data | SIGMOD 2007 |


» Large-Scale Discovery of Spatially Related Images

» Simplifying the Clickstream Retrieval Using Weblogger Tool

» Interactive exploration of very large relational datasets through 3D dynamic projections

» Boosting for Model-Based Data Clustering

» Finding Semantically Related Words in Large Corpora

» Integrated Document Browsing and Data Acquisition for Building Large Ontologies

» A Web-based and Grid-enabled dChip version for the analysis of large sets of gene expression...

» Mining Processes using cluster approach for representing workflows

Post Info

Added	08 Dec 2009
Updated	08 Dec 2009
Type	Conference
Year	2007
Where	SIGMOD
Authors	Hung-chih Yang, Ali Dasdan, Ruey-Lung Hsiao, Douglas Stott Parker Jr.
