In this paper, we study the problem of optimal matrix partitioning for parallel dense factorization on heterogeneous processors. First, we outline existing algorithms solving the ...
The MPI-2 Standard has carefully specified the interaction between MPI and user-created threads. The goal of this specification is to allow users to write multithreaded MPI progr...
In this paper, we focus on the MPI collective algorithm selection process and explore the applicability of the quadtree encoding method to this problem. During the algorithm ...
Jelena Pjesivac-Grbovic, George Bosilca, Graham E....
The fat-tree is one of the topologies most widely used to build high-performance parallel computers. However, fat-trees are expensive and difficult to build. In this paper we propose t...
The graphics processing unit (GPU) is used to solve large linear systems derived from partial differential equations. The differential equations studied are strongly convection-...
Joseph M. Elble, Nikolaos V. Sahinidis, Panagiotis...