Stencil-based kernels constitute the core of many scientific applications on block-structured grids. Unfortunately, these codes achieve a low fraction of peak performance, due pr...
Shoaib Kamil, Kaushik Datta, Samuel Williams, Leon...
Abstract. In this paper we present an implicit dictionary with the working set property i.e. a dictionary supporting insert(e), delete(x) and predecessor(x) in O(log n) time and se...
We consider triply-nested loops of the type that occur in the standard Gaussian elimination algorithm, which we denote by GEP (or the Gaussian Elimination Paradigm). We present tw...
Several existing cache-oblivious dynamic dictionaries achieve O(logB N) (or slightly better O(logB N M )) memory transfers per operation, where N is the number of items stored, M ...
Several fast sequential algorithms have been proposed in the past to multiply sparse matrices. These algorithms do not explicitlyaddresstheimpactofcachingonperformance. We show th...