Abstract. We present new performance models and a new, more compact data structure for cache blocking when applied to the sparse matrix-vector multiply (SpM×V) operation, y ← y +...
Rajesh Nishtala, Richard W. Vuduc, James Demmel, K...
In this paper, we propose and implement a vector processing system that includes two identical vector microprocessors embedded in two FPGA chips. Each vector microprocessor suppor...
Hongyan Yang, Shuai Wang, Sotirios G. Ziavras, Jie...
The displacement structure is extended to a Kronecker matrix W ⊗ Z. A new class of Kronecker-like matrices with displacement rank r, r < n, will be formulated and presented. ...
Abstract. Hierarchical (H)-matrices approximate full or sparse matrices using a hierarchical data sparse format. The corresponding H-matrix arithmetic reduces the time complexity o...
We prove that for any real-valued matrix X ∈ R^{m×n} and positive integers r ≥ k, there is a subset of r columns of X such that projecting X onto their span gives a √((r+1)/(r−k+1))-a...