Modern reconfigurable computing systems feature powerful hybrid architectures with multiple microprocessor cores, large reconfigurable logic arrays and distributed memory hierarch...
Modern commodity hardware architectures, with their multiple multi-core CPUs and high-speed system interconnects, exhibit tremendous power. In this paper, we study performance lim...
Norbert Egi, Adam Greenhalgh, Mark Handley, Micka&...
The objective of this paper is to extend, in the context of multicore architectures, the concepts of tile algorithms [Buttari et al., 2007] for Cholesky, LU, and QR factorizations to t...
We demonstrate Spiral, a domain-specific library generation system. Spiral generates high-performance source code for linear transforms (such as the discrete Fourier transform and ...
Sparse matrix-vector multiplication is an important computational kernel that tends to perform poorly on modern processors, largely because of its high ratio of memory op...