Graphics processing units (GPUs) provide both memory bandwidth and arithmetic performance far greater than that available on CPUs but, because of their Single-Instruction-Multiple...
The Standard Template Adaptive Parallel Library (stapl) is a parallel programming framework that extends C++ and stl with support for parallelism. stapl provides a collection of pa...
Gabriel Tanase, Chidambareswaran Raman, Mauro Bian...
The novel design of an efficient FPGA interconnection architecture with multiple Switch Boxes (SB) and hardwired connections for realizing data intensive applications (i.e. DSP ap...
This paper describes a proposal for a set of Parallel Basic Linear Algebra Subprograms PBLAS. The PBLAS are targeted at distributed vector-vector, matrix-vector and matrixmatrix...
Jaeyoung Choi, Jack Dongarra, Susan Ostrouchov, An...
This paper discusses the implementation of a numerical algorithm for simulating incompressible fluid flows based on the finite difference method and designed for parallel compu...