Within the parallel computing domain, field programmable gate arrays (FPGA) are no longer restricted to their traditional role as substitutes for application-specific integrated circuits–as hardware “hidden” from the end user. Several high performance computing vendors offer parallel reconfigurable computers employing user-programmable FPGAs. These exciting new architectures allow end-users to, in effect, create reconfigurable coprocessors targeting the computationally intensive parts of each problem. The increased capability of contemporary FPGAs coupled with the embarrassingly parallel nature of the Jacobi iterative method make the Jacobi method an ideal candidate for hardware acceleration. This paper introduces a parameterized design for a deeply pipelined, highly parallelized IEEE 64-bit floating-point version of the Jacobi method. A Jacobi circuit is implemented using a Xilinx Virtex-II Pro as the target FPGA device. Implementation statistics and performance estimates ...
Gerald R. Morris, Viktor K. Prasanna