Abstract. We investigate the performance of two approaches to matrix inversion, one based on Gaussian elimination (LU factorization) and the other on Gauss-Jordan elimination. The target architecture is a current general-purpose multicore processor connected to a graphics processor (GPU). Parallelism is extracted on both processors by linking sequential versions of the codes with multi-threaded implementations of BLAS. Results on a system with two Intel QuadCore processors and a Tesla C1060 GPU illustrate the performance and scalability attained by the codes.

Key words: Matrix sign function, hybrid platforms, GPUs, multi-core processors, linear algebra, high performance computing.
Peter Benner, Pablo Ezzatti, Enrique S. Quintana-O