A Simple Synchronous Distributed-Memory Algorithm for the HPCC RandomAccess Benchmark


The RandomAccess benchmark, as defined by the High Performance Computing Challenge (HPCC), tests the speed at which a machine can update the elements of a table spread across global system memory, measured in billions (giga) of updates per second (GUPS). The parallel implementation provided by HPCC typically performs poorly on distributed-memory machines because updates require numerous small point-to-point messages between processors. We present an alternative algorithm that treats the collection of P processors as a hypercube, aggregating data so that larger messages are sent, and routing individual datums through the dimensions of the hypercube to their destination processor. The algorithm's computation (the GUP count) scales linearly with P while its communication overhead scales as log2(P), thus enabling better performance on large numbers of processors. The new algorithm achieves a GUPS rate of 19.98 on 8192 processors of San
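The hypercube routing idea in the abstract can be illustrated with a small single-process simulation. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name hypercube_route, the (destination rank, value) pair format, and the in-memory lists standing in for messages are all hypothetical, and P is assumed to be a power of two. In stage k, each simulated rank bundles every item whose destination differs from its own rank in bit k and hands that bundle to its partner across dimension k.

```python
def hypercube_route(outboxes):
    """Route (dest_rank, value) pairs among P = 2**d simulated ranks.

    outboxes[r] holds the pairs that rank r starts with (hypothetical format).
    Returns the per-rank lists after d exchange stages and the message count.
    """
    num_ranks = len(outboxes)
    # Assumption of this sketch: P is a power of two.
    assert num_ranks > 0 and num_ranks & (num_ranks - 1) == 0
    dims = num_ranks.bit_length() - 1  # d = log2(P)

    held = [list(items) for items in outboxes]
    messages = 0
    for k in range(dims):
        bit = 1 << k
        keep = [[] for _ in range(num_ranks)]
        send = [[] for _ in range(num_ranks)]
        for rank in range(num_ranks):
            for dest, value in held[rank]:
                # An item crosses dimension k only if bit k of its
                # destination differs from bit k of the rank holding it.
                if (dest ^ rank) & bit:
                    send[rank].append((dest, value))
                else:
                    keep[rank].append((dest, value))
        for rank in range(num_ranks):
            partner = rank ^ bit  # neighbor across dimension k
            # One aggregated message per rank per stage (possibly empty).
            keep[partner].extend(send[rank])
            messages += 1
        held = keep
    return held, messages
```

With P = 8 simulated ranks, every item reaches its destination rank after 3 stages, and each rank sends 3 aggregated messages regardless of how many items it holds. That is the log2(P) communication pattern the abstract describes, in contrast to one small point-to-point message per update.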

Steven J. Plimpton, Ron Brightwell, Courtenay Vaughan, Keith D. Underwood, Mike Davis


CLUSTER 2006 | Cluster Computing | Performance Computing Challenge | Processors | Small Point-to-point Messages |


Post Info

Added	10 Jun 2010
Updated	10 Jun 2010
Type	Conference
Year	2006
Where	CLUSTER
Authors	Steven J. Plimpton, Ron Brightwell, Courtenay Vaughan, Keith D. Underwood, Mike Davis
