A High Throughput FPGA-Based Implementation of the Lanczos Method for the Symmetric Extremal Eigenvalue Problem

Iterative numerical algorithms with high memory bandwidth requirements but medium-size data sets (matrix size ∼ a few hundred) are highly appropriate for FPGA acceleration. This paper presents a streaming architecture comprising floating-point operators coupled with high-bandwidth on-chip memories for the Lanczos method, an iterative algorithm for computing the eigenvalues of symmetric matrices. We show that the Lanczos method can be specialized to compute only the extremal eigenvalues, and we present an architecture that achieves a sustained single-precision floating-point performance of 175 GFLOPs on a Virtex6-SX475T for a dense matrix of size 335×335. We perform a quantitative comparison with parallel implementations of the Lanczos method that use the optimized Intel MKL and CUBLAS libraries for multi-core and GPU respectively. We find that for a range of matrices the FPGA implementation outperforms both multi-core and GPU: a speedup of 8.2-27.3× (13.4× geo. mean) over an Intel Xeon X5650 and 26.2-11...
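As a rough illustration of the algorithm the abstract refers to, the following is a minimal pure-Python sketch of the Lanczos iteration on a dense symmetric matrix, followed by extraction of only the smallest and largest eigenvalue from the resulting tridiagonal matrix. It is not the paper's architecture: the function names (`lanczos`, `count_below`, `kth_eigenvalue`, `extremal_eigenvalues`), the random start vector, the full reorthogonalization step, and the use of Sturm-count bisection on the tridiagonal matrix are all assumptions made here for illustration. The dense matrix-vector product inside the loop is the step that costs on the order of n² operations per iteration and dominates memory traffic.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(A, v):
    # Dense matrix-vector product: the O(n^2) kernel of each iteration.
    return [dot(row, v) for row in A]


def lanczos(A, k, seed=0):
    """Run up to k Lanczos steps on symmetric A.

    Returns (alphas, betas): the diagonal and off-diagonal of the
    tridiagonal matrix T, with len(betas) == len(alphas) - 1.
    """
    n = len(A)
    rng = random.Random(seed)
    q = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    norm = math.sqrt(dot(q, q))
    q = [x / norm for x in q]
    q_prev, beta = [0.0] * n, 0.0
    basis, alphas, betas = [q], [], []
    for _ in range(k):
        w = matvec(A, q)
        alpha = dot(w, q)
        w = [wi - alpha * qi - beta * pi for wi, qi, pi in zip(w, q, q_prev)]
        # Full reorthogonalization (an assumption of this sketch) keeps the
        # basis numerically orthogonal at extra cost.
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        alphas.append(alpha)
        beta = math.sqrt(dot(w, w))
        if beta < 1e-12 or len(alphas) == k:
            break
        betas.append(beta)
        q_prev, q = q, [wi / beta for wi in w]
        basis.append(q)
    return alphas, betas


def count_below(alphas, betas, x):
    """Sturm count: number of eigenvalues of T strictly below x."""
    count, d = 0, 1.0
    for i, a in enumerate(alphas):
        b2 = betas[i - 1] ** 2 if i > 0 else 0.0
        d = a - x - b2 / d
        if d == 0.0:
            d = 1e-30  # avoid division by zero on the next step
        if d < 0.0:
            count += 1
    return count


def kth_eigenvalue(alphas, betas, k, tol=1e-12):
    """Bisection for the k-th smallest (0-indexed) eigenvalue of T."""
    m = len(alphas)
    radius = [(betas[i - 1] if i > 0 else 0.0) + (betas[i] if i < m - 1 else 0.0)
              for i in range(m)]
    lo = min(a - r for a, r in zip(alphas, radius))  # Gershgorin bounds
    hi = max(a + r for a, r in zip(alphas, radius))
    while hi - lo > tol * max(1.0, abs(lo), abs(hi)):
        mid = 0.5 * (lo + hi)
        if count_below(alphas, betas, mid) > k:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def extremal_eigenvalues(A, k):
    """Approximate (smallest, largest) eigenvalue of A from k Lanczos steps."""
    alphas, betas = lanczos(A, k)
    return (kth_eigenvalue(alphas, betas, 0),
            kth_eigenvalue(alphas, betas, len(alphas) - 1))


if __name__ == "__main__":
    n = 8
    # Test matrix: 1D Laplacian stored densely (chosen only for illustration).
    A = [[2.0 if i == j else (-1.0 if abs(i - j) == 1 else 0.0)
          for j in range(n)] for i in range(n)]
    print(extremal_eigenvalues(A, n))
```

Only the two ends of the spectrum are located, so the bisection touches just the tridiagonal coefficients and never forms eigenvectors; this is one simple way to specialize the method to extremal eigenvalues, under the assumptions stated above.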

Abid Rafique, Nachiket Kapre, George A. Constantinides

ARC 2012 | Hardware | Intel Xeon | Iterative Algorithm | Parallel Implementations |

Post Info

Added	19 Apr 2012
Updated	19 Apr 2012
Type	Journal
Year	2012
Where	ARC
Authors	Abid Rafique, Nachiket Kapre, George A. Constantinides
