In this paper, we propose and implement a vector processing system that includes two identical vector microprocessors embedded in two FPGA chips. Each vector microprocessor supports floating-point calculations and efficient sparse matrix operations. Dense matrix-matrix multiplication and sparse matrix-vector multiplication with benchmark matrices from various application domains were run on the system to evaluate its performance. The resulting calculation times are compared with those of a commercial PC to show the effectiveness of our approach.
Hongyan Yang, Shuai Wang, Sotirios G. Ziavras, Jie