IPPS
2005
IEEE

A Hardware Acceleration Unit for MPI Queue Processing

With the heavy reliance of modern scientific applications upon the MPI Standard, it has become critical for the implementation of MPI to be as capable and as fast as possible. This has led some of the fastest modern networks to introduce the capability to offload aspects of MPI processing to an embedded processor on the network interface. With this important capability have come significant performance implications. Most notably, the time to process long queues of posted receives or unexpected messages is substantially longer on embedded processors. This paper presents an associative list matching structure to accelerate the processing of moderate-length queues in MPI. Simulations are used to compare the performance of an embedded processor augmented with this capability to a baseline implementation. The proposed enhancement significantly reduces latency for moderate-length queues while adding virtually no overhead for extremely short queues.
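To illustrate why queue length matters, the following is a minimal software sketch of posted-receive matching, not the hardware unit proposed in the paper. It assumes the usual MPI matching rule: an incoming message matches the first posted receive, in posting order, whose communicator, source, and tag agree, with wildcards allowed for source and tag. The names `PostedReceive`, `entry_matches`, and `match_posted_receive`, and the wildcard values, are hypothetical and chosen only for this example.

```python
from collections import namedtuple

# Hypothetical stand-ins for MPI_ANY_SOURCE and MPI_ANY_TAG (values are assumptions).
ANY_SOURCE = -1
ANY_TAG = -1

# One entry in the posted receive queue (field names are illustrative).
PostedReceive = namedtuple("PostedReceive", "comm source tag buffer_id")


def entry_matches(recv, comm, source, tag):
    """Return True if a posted receive accepts a message with this envelope."""
    return (
        recv.comm == comm
        and recv.source in (ANY_SOURCE, source)
        and recv.tag in (ANY_TAG, tag)
    )


def match_posted_receive(queue, comm, source, tag):
    """Linear search in posting order.

    Returns (matched_entry, entries_examined). The matched entry is removed
    from the queue. If nothing matches, returns (None, len(queue)).
    """
    for index, recv in enumerate(queue):
        if entry_matches(recv, comm, source, tag):
            return queue.pop(index), index + 1
    return None, len(queue)


queue = [
    PostedReceive(comm=0, source=3, tag=7, buffer_id="a"),
    PostedReceive(comm=0, source=ANY_SOURCE, tag=9, buffer_id="b"),
    PostedReceive(comm=0, source=5, tag=ANY_TAG, buffer_id="c"),
]
matched, examined = match_posted_receive(queue, comm=0, source=5, tag=9)
```

In this sketch the cost of a match grows with the number of entries examined, which is the work that becomes expensive on a slow embedded processor. An associative structure, as the abstract describes it, would instead compare the incoming envelope against many entries at once while still honoring posting order.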
Type Conference
Year 2005
Where IPPS
Authors Keith D. Underwood, K. Scott Hemmert, Arun Rodrigues, Richard C. Murphy, Ron Brightwell