Solving dense linear systems on platforms with multiple hardware accelerators

In a previous paper we showed how the FLAME methods and tools provide a solution for computing dense linear algebra operations on a multi-GPU platform with reasonable performance while requiring little programming effort. In this paper we generalize the approach to systems with multiple hardware accelerators and incorporate software implementations of standard cache/memory coherence techniques from computer architecture to improve performance. Our experimental evaluation on an NVIDIA Tesla S870 platform delivers a peak performance well over 400 GFLOPS.

Enrique S. Quintana-Ortí, Francisco D. Igual, Gregorio Quintana-Ortí, Robert A. van de Geijn

Dense Linear | Multiple Hardware Accelerators | Parallel Computing | PPOPP 2009 | Tesla S870 Platform

» An Extension of the StarSs Programming Model for Platforms with Multiple GPUs

» Synergistic execution of stream programs on multicores with accelerators

» Sparse transformations and preconditioners for hierarchical 3D capacitance extraction with...

» A High Throughput FPGA-based Floating Point Conjugate Gradient Implementation

» Optimising Memory Bandwidth Use for Matrix-Vector Multiplication in Iterative Methods

» StarPU: A Unified Platform for Task Scheduling on Heterogeneous Multicore Architectures

» Substrate resistance extraction with direct boundary element method

» Emulation-based transient thermal modeling of 2D/3D systems-on-chip with active cooling

Post Info

Added	25 Nov 2009
Updated	25 Nov 2009
Type	Conference
Year	2009
Where	PPOPP
Authors	Enrique S. Quintana-Ortí, Francisco D. Igual, Gregorio Quintana-Ortí, Robert A. van de Geijn
