We present a fast and scalable matrix multiplication algorithm on distributed memory concurrent computers, whose performance is independent of data distribution on processors, and...
We report our progress on SSOCK, a scalable high-performance communication library for wide-area environments. SSOCK has an API similar to that of the Socket library, but solves th...
Parallel architectures have become an increasingly popular way to achieve high performance with low power consumption. To leverage these benefits, applicatio...
The Cray MTA-2 system provides exceptional performance on a variety of sparse graph algorithms. Unfortunately, it was an extremely expensive platform. Cray is preparing an Eldorad...
Keith D. Underwood, Megan Vance, Jonathan W. Berry...
We show empirically that some of the issues that affected the design of linear algebra libraries for distributed memory architectures will also likely affect such libraries for s...
Bryan Marker, Field G. Van Zee, Kazushige Goto, Gr...