Sciweavers

CONCURRENCY
1998

A new parallel matrix multiplication algorithm on distributed-memory concurrent computers

We present a new fast and scalable matrix multiplication algorithm, called DIMMA (Distribution-Independent Matrix Multiplication Algorithm), for block cyclic data distribution on distributed-memory concurrent computers. The algorithm is based on two new ideas: it uses a modified pipelined communication scheme to overlap computation and communication effectively, and it exploits the LCM block concept to obtain the maximum performance of the sequential BLAS routine in each processor, whether the block size is very small or very large. The algorithm is implemented and compared with SUMMA on the Intel Paragon computer.
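The abstract does not spell out the LCM block concept, so the following is only a minimal sketch of the arithmetic it appears to rest on, not the paper's implementation. It assumes a P x Q process grid and a plain block cyclic layout in which block k belongs to process row k mod P and process column k mod Q. The names `owner`, `lcm`, and `group_by_owner` are hypothetical helpers introduced for illustration. Under that assumption, blocks whose indices differ by a multiple of lcm(P, Q) share the same process row and column, so they could in principle be combined into one larger local BLAS call.

```python
from math import gcd


def owner(block_index: int, nprocs: int) -> int:
    """Process coordinate that owns a block in a 1-D block cyclic layout (assumed layout)."""
    return block_index % nprocs


def lcm(a: int, b: int) -> int:
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def group_by_owner(num_blocks: int, p: int, q: int) -> dict:
    """Group block indices by their (process row, process column) pair.

    Indices that land in the same group are exactly those that differ
    by a multiple of lcm(p, q).
    """
    groups = {}
    for k in range(num_blocks):
        key = (owner(k, p), owner(k, q))
        groups.setdefault(key, []).append(k)
    return groups


# Hypothetical example: a 2 x 3 process grid and 12 blocks along the summation dimension.
P, Q = 2, 3
groups = group_by_owner(12, P, Q)
for key in sorted(groups):
    print(key, groups[key])
```

In this example lcm(2, 3) = 6, so the 12 blocks fall into 6 groups of two indices each, spaced 6 apart. Whether DIMMA groups blocks in exactly this way is an assumption here; the abstract only states that the LCM block concept is used to keep the sequential BLAS routine efficient.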
Jaeyoung Choi
Added 22 Dec 2010
Updated 22 Dec 2010
Type Journal
Year 1998
Where CONCURRENCY
Authors Jaeyoung Choi