The performance of both serial and parallel implementations of matrix multiplication is highly sensitive to memory system behavior. False sharing and cache conflicts cause traditi...
Siddhartha Chatterjee, Alvin R. Lebeck, Praveen K....
Matrix multiplication is an important kernel in linear algebra algorithms, and the performance of both serial and parallel implementations is highly dependent on the memory system...
Siddhartha Chatterjee, Alvin R. Lebeck, Praveen K....
We perform forward error analysis for a large class of recursive matrix multiplication algorithms in the spirit of [D. Bini and G. Lotti, Stability of fast algorithms for matrix m...
James Demmel, Ioana Dumitriu, Olga Holtz, Robert K...
This paper investigates the placement of data and parity on redundant disk arrays. Declustered organizations have been traditionally used to achieve fast reconstruction of a faile...
Guillermo A. Alvarez, Walter A. Burkhard, Larry J....
The known fast sequential algorithms for multiplying two N N matrices (over an arbitrary ring) have time complexity ON , where 2 3. The current best value of is less than 2.3755....