We consider a generalization of the codes defined by norm and trace functions on finite fields introduced by Olav Geil. The codes in the new family still satisfy Geil’s dualit...
We present a simulation-based performance model to analyze a parallel sparse LU factorization algorithm on modern cache-based, high-end parallel architectures. We consider supern...
The split-radix FFT computes a size-n complex DFT, when n is a large power of 2, using just 4n lg n − 6n + 8 arithmetic operations on real numbers. This operation count was first an...
Let F^n be the binary n-cube, or binary Hamming space of dimension n, endowed with the Hamming distance, and E^n (respectively, O^n) the set of vectors with even (respectively, odd)...
We present new performance models and a new, more compact data structure for cache blocking when applied to the sparse matrix-vector multiply (SpM×V) operation, y ← y +...
Rajesh Nishtala, Richard W. Vuduc, James Demmel, K...