Hyracks: A flexible and extensible foundation for data-intensive computing

Abstract: Hyracks is a new partitioned-parallel software platform designed to run data-intensive computations on large shared-nothing clusters of computers. Hyracks allows users to express a computation as a DAG of data operators and connectors. Operators operate on partitions of input data and produce partitions of output data, while connectors repartition operators’ outputs to make the newly produced partitions available at the consuming operators. We describe the Hyracks end user model, for authors of dataflow jobs, and the extension model for users who wish to augment Hyracks’ built-in library with new operator and/or connector types. We also describe our initial Hyracks implementation. Since Hyracks is in roughly the same space as the open source Hadoop platform, we compare Hyracks with Hadoop experimentally for several different kinds of use cases. The initial results demonstrate that Hyracks has significant promise as a next-generation platform for data-intensive applicati...
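The operator/connector model described in the abstract can be illustrated with a small single-process sketch. This is not the Hyracks API: every name here (`tokenize_operator`, `hash_connector`, `count_operator`, `run_job`) is hypothetical, and the choice of a word-count job with a hash-based connector is an assumption made only to show how operators work on partitions while a connector repartitions their output for the consuming operator.

```python
from collections import Counter
from zlib import crc32


def tokenize_operator(partition):
    """Operator (hypothetical): turn one partition of lines into one partition of words."""
    return [word for line in partition for word in line.split()]


def hash_connector(partitions, num_consumers):
    """Connector (hypothetical): repartition producer outputs by hashing each word,
    so that all copies of a word reach the same consuming partition."""
    out = [[] for _ in range(num_consumers)]
    for partition in partitions:
        for word in partition:
            # crc32 is used because it is deterministic across runs.
            out[crc32(word.encode("utf-8")) % num_consumers].append(word)
    return out


def count_operator(partition):
    """Operator (hypothetical): count the words within a single partition."""
    return Counter(partition)


def run_job(input_partitions, num_consumers):
    """Run the three-stage DAG: tokenize -> hash repartition -> count."""
    tokenized = [tokenize_operator(p) for p in input_partitions]
    repartitioned = hash_connector(tokenized, num_consumers)
    return [count_operator(p) for p in repartitioned]


if __name__ == "__main__":
    # Two input partitions, three consuming partitions.
    results = run_job([["a b a", "c"], ["b a"]], 3)
    for index, counts in enumerate(results):
        print(index, dict(counts))
```

Because the connector hashes on the word itself, each word ends up in exactly one output partition, so the per-partition counts can be combined without further merging. In a real shared-nothing cluster the partitions would live on different machines and the connector would move data over the network; here everything runs in one process purely for illustration.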

Vinayak R. Borkar, Michael J. Carey, Raman Grover, Nicola Onose, Rares Vernica

Connectors Repartition Operators | Database | ICDE 2011 | Operator | Source Hadoop Platform |

Post Info

Added	21 Aug 2011
Updated	21 Aug 2011
Type	Journal
Year	2011
Where	ICDE
Authors	Vinayak R. Borkar, Michael J. Carey, Raman Grover, Nicola Onose, Rares Vernica
