Concurrent Number Cruncher: An Efficient Sparse Linear Solver on the GPU

A wide class of geometry processing and PDE solution methods needs to solve a linear system whose non-zero pattern is dictated by the connectivity matrix of the mesh. The advent of GPUs, with their ever-growing parallel computing power, makes them a tempting resource for such numerical computations. New APIs (CTM from ATI and CUDA from NVIDIA) help by giving direct access to the multithreaded computational resources and associated memory bandwidth of GPUs; CUDA even provides a BLAS implementation (CuBLAS), but only for dense matrices. However, existing GPU linear solvers are restricted to specific types of matrices or use non-optimal compressed row storage strategies. By combining recent GPU programming techniques with supercomputing strategies (namely block compressed row storage and register blocking), we implement a general-purpose sparse linear solver that outperforms leading-edge CPU counterparts (MKL / ACML).
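To make the block compressed row storage (BCRS) idea from the abstract concrete, here is a minimal sketch of a sparse matrix-vector product over a matrix stored in fixed-size dense blocks. It is plain Python running on the CPU, not the authors' GPU implementation. The function name `bcrs_spmv`, the array names `row_ptr`, `col_idx`, and `blocks`, and the 4x4 example matrix are all assumptions chosen for illustration. The local accumulator `acc` only gestures at register blocking, where a block row's partial sums would be kept in registers.

```python
def bcrs_spmv(block_size, row_ptr, col_idx, blocks, x):
    """Compute y = A @ x for A stored in block compressed row storage.

    row_ptr[i]..row_ptr[i+1] indexes the non-zero blocks of block row i,
    col_idx[k] is the block column of block k, and blocks[k] holds that
    block's block_size*block_size entries in row-major order.
    """
    b = block_size
    n_block_rows = len(row_ptr) - 1
    y = [0.0] * (n_block_rows * b)
    for bi in range(n_block_rows):
        # Partial sums for the b output rows of this block row
        # (hypothetical stand-in for values held in registers).
        acc = [0.0] * b
        for k in range(row_ptr[bi], row_ptr[bi + 1]):
            bj = col_idx[k]
            block = blocks[k]
            xs = x[bj * b:(bj + 1) * b]  # slice of x reused by the whole block
            for r in range(b):
                for c in range(b):
                    acc[r] += block[r * b + c] * xs[c]
        y[bi * b:(bi + 1) * b] = acc
    return y


# Illustrative 4x4 matrix stored as 2x2 blocks (values are made up):
# [4 1 0 0]
# [1 3 0 0]
# [0 0 2 1]
# [1 0 1 5]
row_ptr = [0, 1, 3]
col_idx = [0, 0, 1]
blocks = [
    [4.0, 1.0, 1.0, 3.0],  # block (0, 0)
    [0.0, 0.0, 1.0, 0.0],  # block (1, 0)
    [2.0, 1.0, 1.0, 5.0],  # block (1, 1)
]
x = [1.0, 2.0, 3.0, 4.0]
y = bcrs_spmv(2, row_ptr, col_idx, blocks, x)
```

Compared with plain compressed row storage, this layout stores one column index per block rather than one per non-zero entry, and each loaded slice of `x` is reused across all rows of the block.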

Luc Buatois, Guillaume Caumon, Bruno Lévy

Distributed And Parallel Computing | GPU Linear Solvers | HPCC 2007 | Linear Solver | Row Storage |

Post Info

Added	16 Aug 2010
Updated	16 Aug 2010
Type	Conference
Year	2007
Where	HPCC
Authors	Luc Buatois, Guillaume Caumon, Bruno Lévy
