Sparse matrix-vector multiplication forms the heart of iterative linear solvers used widely in scientific computation (e.g., finite element methods). In such solvers, the matrix-vector product is computed repeatedly, often thousands of times, with updated values of the vector until convergence is achieved. On an SIMD architecture, each processor must fetch the updated off-processor vector elements while computing its share of the product. In this paper, we report on run-time optimization of array distribution and off-processor data fetching to reduce both communication and computation time. The optimization is applied to a sparse matrix stored in a compressed sparse row-wise format. Actual runs on test matrices produced up to a 35 percent relative improvement over a block distribution with a naive multiplication algorithm, while simulations over a wider range of processor counts indicate that up to a 60 percent improvement may be possible in some cases.
Louis H. Ziantz, Can C. Özturan, Boleslaw K. Szymanski
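For reference, the repeated kernel whose distribution and data fetching are optimized is the standard matrix-vector product over a compressed sparse row-wise (CSR) matrix. The following is a minimal serial sketch of that kernel; the function and array names (csr_spmv, row_ptr, col_idx, val) are illustrative and are not taken from the paper's implementation, which distributes the work across SIMD processors.

```c
#include <stddef.h>

/* Serial y = A*x with A stored in compressed sparse row (CSR) format.
 * row_ptr has n+1 entries; the nonzeros of row i occupy positions
 * row_ptr[i] .. row_ptr[i+1]-1 of val[] and col_idx[]. */
void csr_spmv(size_t n,
              const size_t *row_ptr,
              const size_t *col_idx,
              const double *val,
              const double *x,
              double *y)
{
    for (size_t i = 0; i < n; ++i) {
        double sum = 0.0;
        for (size_t k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
            sum += val[k] * x[col_idx[k]];   /* gather of x drives off-processor fetches */
        y[i] = sum;
    }
}
```

In a distributed setting, the indirect accesses x[col_idx[k]] are what require fetching updated off-processor vector elements on each iteration of the solver, which is why the distribution of rows and vector elements strongly affects both communication and computation time.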