— As network infrastructures with 10 Gb/s bandwidth and beyond have become pervasive and as cost advantages of large commodity-machine clusters continue to increase, research and industry strive to exploit the available processing performance for large-scale database processing tasks. In this work we look at the use of high-speed networks for distributed join processing. We propose Data Roundabout as a lightweight transport layer that uses Remote Direct Memory Access (RDMA) to gain access to the throughput opportunities in modern networks. The essence of Data Roundabout is a ringshaped network in which each host stores one portion of a large database instance. We leverage the available bandwidth to (continuously) pump data through the high-speed network. Based on Data Roundabout, we demonstrate cyclo-join, which exploits the cycling flow of data to execute distributed joins. The study uses different join algorithms (hash join and sort-merge join) to expose the pitfalls and the advan...
Philip Werner Frey, Romulo Goncalves, Martin L. Ke