Sciweavers

955 search results - page 11 / 191
» Performance optimization of multiple memory architectures fo...
Sort
View
PLDI
1995
ACM
13 years 11 months ago
Storage Assignment to Decrease Code Size
DSP architectures typically provide indirect addressing modes with auto-increment and decrement. In addition, indexing mode is not available, and there are usually few, if any, ge...
Stan Y. Liao, Srinivas Devadas, Kurt Keutzer, Stev...
ICS
2009
Tsinghua U.
14 years 2 months ago
Performance modeling and automatic ghost zone optimization for iterative stencil loops on GPUs
Iterative stencil loops (ISLs) are used in many applications and tiling is a well-known technique to localize their computation. When ISLs are tiled across a parallel architecture...
Jiayuan Meng, Kevin Skadron
JPDC
2008
108views more  JPDC 2008»
13 years 7 months ago
Energy minimization with loop fusion and multi-functional-unit scheduling for multidimensional DSP
Energy saving is becoming one of the major design issues in processor architectures with multiple functional units (FUs). Nested loops are usually the most critical part in multim...
Meikang Qiu, Edwin Hsing-Mean Sha, Meilin Liu, Man...
WMPI
2004
ACM
14 years 1 months ago
Scalable cache memory design for large-scale SMT architectures
The cache hierarchy design in existing SMT and superscalar processors is optimized for latency, but not for bandwidth. The size of the L1 data cache did not scale over the past dec...
Muhamed F. Mudawar