This paper presents reduction recognition and parallel code generationstrategies for distributed-memorymultiprocessors. We describe techniques to recognize a broad range of implic...
Sparse LU factorization with partial pivoting is important for many scienti c applications and delivering high performance for this problem is di cult on distributed memory machin...
Multiple programming models are emerging to address an increased need for dynamic task parallelism in applications for multicore processors and shared-address-space parallel compu...
Yi Guo, Rajkishore Barik, Raghavan Raman, Vivek Sa...
Parallel algorithm designers need computational models that take first order system costs into account, but are also simple enough to use in practice. This paper introduces the L...
The traditional approach to the parallelization of linear algebra algorithms such as matrix multiplication and LU factorization calls for static allocation of matrix blocks to proc...
Marc Mazzariol, Benoit A. Gennart, Vincent Messerl...