Regular distributions for storing dense matrices on parallel systems are not always used in practice. In many scientific applicati RUMMA) [1] to handle irregularly distributed mat...
This paper presents a new partitioning algorithm to perform matrix multiplication on two interconnected heterogeneous processors. Data is partitioned in a way which minimizes the ...
Floating-point Sparse Matrix-Vector Multiplication (SpMXV) is a key computational kernel in scientific and engineering applications. The poor data locality of sparse matrices sig...
Finding latent factors of the data using matrix factorizations is a tried-and-tested approach in data mining. But finding shared factors over multiple matrices is more novel prob...
This paper presents some improvements on the optimization of hardware multiplication by constant matrices. We focus on the automatic generation of circuits that involve constant m...