Sciweavers

436 search results - page 6 / 88
LCPC
2005
Springer
14 years 1 month ago
Applying Data Copy to Improve Memory Performance of General Array Computations
Abstract. Data copy is an important compiler optimization which dynamically rearranges the layout of arrays by copying their elements into local buffers. Traditionally, array copy...
Qing Yi
HPCC
2005
Springer
14 years 1 month ago
Fast Sparse Matrix-Vector Multiplication by Exploiting Variable Block Structure
Abstract. We improve the performance of sparse matrix-vector multiplication (SpMV) on modern cache-based superscalar machines when the matrix structure consists of multiple, irregu...
Richard W. Vuduc, Hyun-Jin Moon
CLUSTER
2011
IEEE
12 years 7 months ago
Performance Characterization and Optimization of Atomic Operations on AMD GPUs
Atomic operations are important building blocks in supporting general-purpose computing on graphics processing units (GPUs). For instance, they can be used to coordinate executi...
Marwa Elteir, Heshan Lin, Wu-chun Feng
PAMI
2012
11 years 10 months ago
Face Recognition Using Sparse Approximated Nearest Points between Image Sets
We propose an efficient and robust solution for image set classification. A joint representation of an image set is proposed which includes the image samples of the set and thei...
Yiqun Hu, Ajmal S. Mian, Robyn A. Owens
ASPLOS
1991
ACM
13 years 11 months ago
The Cache Performance and Optimizations of Blocked Algorithms
Blocking is a well-known optimization technique for improving the effectiveness of memory hierarchies. Instead of operating on entire rows or columns of an array, blocked algorith...
Monica S. Lam, Edward E. Rothberg, Michael E. Wolf