Numerical linear algebra operations are key primitives in scientific computing, and their performance optimization has been extensively investigated. With rapid advances in technology, hardware acceleration of linear algebra applications using field-programmable gate arrays (FPGAs) has become feasible. In this paper, we propose FPGA-based designs for several basic linear algebra operations: dot product, matrix-vector multiplication, matrix multiplication, and matrix factorization. For each operation, we identify the design parameters, analyze the trade-offs, and propose a high-performance design. When implementing the designs, we determine the parameter values according to hardware constraints such as the available chip area, the size of available memory, the memory bandwidth, and the number of I/O pins. The proposed designs are implemented on Xilinx Virtex-II Pro FPGAs. Experimental results show that our designs scale with the available har...
Ling Zhuo, Viktor K. Prasanna