The emergence of high-density reconfigurable hardware devices gives scientists and engineers an option to accelerate their numerical computing applications on low-cost but powerful “FPGA-enhanced computers”. In this paper, we introduce our efforts toward improving the computational performance of the Basic Linear Algebra Subprograms (BLAS) with FPGA-specific algorithms and methods. Our study focuses on three BLAS subroutines: floating-point summation, matrix-vector multiplication, and matrix-matrix multiplication. They represent all three levels of BLAS functionality, and their sustained computational performance is bounded either by memory bandwidth or by computation. By proposing a group-alignment-based floating-point summation method and applying this technique to the other subroutines, we significantly improve their sustained computational performance and reduce numerical errors while consuming only moderate FPGA resources. Compared with existing FPGA-based implementations, our design...
Chuan He, Guan Qin, Richard E. Ewing, Wei Zhao
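To make the core idea concrete, the following is a minimal software sketch of group-alignment summation as the abstract describes it: every operand's mantissa is shifted to the group's largest exponent and the sum is accumulated in one wide fixed-point register, as a hardware adder would do. The function name, the `guard_bits` parameter, and the truncation behavior are illustrative assumptions, not the paper's actual design.

```python
import math

def group_align_sum(values, guard_bits=64):
    """Sketch of group-alignment summation: align each mantissa to the
    group's largest binary exponent and accumulate in a single wide
    fixed-point integer (hypothetical parameters, not the paper's design)."""
    nonzero = [v for v in values if v != 0.0]
    if not nonzero:
        return 0.0
    # Common reference exponent for the whole group.
    max_exp = max(math.frexp(v)[1] for v in nonzero)
    acc = 0  # accumulator in units of 2**(max_exp - guard_bits)
    for v in nonzero:
        m, e = math.frexp(v)          # v == m * 2**e, with 0.5 <= |m| < 1
        mi = int(m * (1 << 53))       # exact 53-bit integer mantissa
        shift = e - 53 - (max_exp - guard_bits)
        if shift >= 0:
            acc += mi << shift
        else:
            # Bits shifted past the guard band are truncated toward zero,
            # mimicking a fixed-width hardware alignment shifter.
            acc += mi >> -shift if mi >= 0 else -((-mi) >> -shift)
    return math.ldexp(acc, max_exp - guard_bits)
```

Because all addends share one exponent frame, the accumulation itself is exact integer arithmetic; rounding occurs only in the final conversion back to floating point, which is what reduces the error accumulation of a naive sequential summation.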