Abstract. We investigate the performance of two approaches for matrix inversion based on Gaussian (LU factorization) and Gauss-Jordan eliminations. The target architecture is a cur...
Peter Benner, Pablo Ezzatti, Enrique S. Quintana-O...
In this paper an optimized k-means implementation on the graphics processing unit (GPU) is presented. NVIDIA’s Compute Unified Device Architecture (CUDA), available from the G8...
This paper presents a novel approach for the problem of generating tiled code for nested for-loops using a tiling transformation. Tiling or supernode transformation has been widel...
Georgios I. Goumas, Maria Athanasaki, Nectarios Ko...
Heterogeneous computing combines general purpose CPUs with accelerators to efficiently execute both sequential control-intensive and data-parallel phases of applications. Existin...
Isaac Gelado, Javier Cabezas, Nacho Navarro, John ...
—Clusters and applications continue to grow in size while their mean time between failure (MTBF) is getting smaller. Checkpoint/Restart is becoming increasingly important for lar...