Sciweavers

772 search results - page 110 / 155
» Hyper-Systolic Matrix Multiplication
HIPC
2005
Springer
14 years 2 months ago
Performance Study of LU Decomposition on the Programmable GPU
With the increasing programmability of GPUs (graphics processing units), these units are emerging as an attractive computing platform not only for traditional graphics computation ...
Fumihiko Ino, Manabu Matsui, Keigo Goda, Kenichi H...
ICS
2005
Tsinghua U.
14 years 2 months ago
Think globally, search locally
A key step in program optimization is the determination of optimal values for code optimization parameters such as cache tile sizes and loop unrolling factors. One approach, which...
Kamen Yotov, Keshav Pingali, Paul Stodghill
PCI
2005
Springer
14 years 2 months ago
Tuning Blocked Array Layouts to Exploit Memory Hierarchy in SMT Architectures
Cache misses form a major bottleneck for memory-intensive applications, due to the significant latency of main memory accesses. Loop tiling, in conjunction with other program tran...
Evangelia Athanasaki, Kornilios Kourtis, Nikos Ana...
PCM
2005
Springer
14 years 2 months ago
Virtual Object Placement in Video for Augmented Reality
This article describes a method to insert virtual objects into a real video stream based on feature tracking and camera pose estimation from a set of single-camera video frames. To...
Jong Seung Park, Mee Young Sung, Sung-Ryul Noh
PPAM
2005
Springer
14 years 2 months ago
A New Diagonal Blocking Format and Model of Cache Behavior for Sparse Matrices
Algorithms for sparse matrix-vector multiplication (SpM×V for short) are important building blocks in solvers of sparse systems of linear equations. Due to matrix sparsity, the...
Pavel Tvrdík, Ivan Simecek