Sciweavers

71 search results - page 10 / 15
» Improving memory bank-level parallelism in the presence of p...
Sort
View
CASES
2001
ACM
13 years 11 months ago
Combined partitioning and data padding for scheduling multiple loop nests
With the widening performance gap between processors and main memory, efficient memory accessing behavior is necessary for good program performance. Loop partition is an effective...
Zhong Wang, Edwin Hsing-Mean Sha, Xiaobo Hu
MICRO
2000
IEEE
176views Hardware» more  MICRO 2000»
13 years 7 months ago
An Advanced Optimizer for the IA-64 Architecture
level of abstraction, compared with the program representation for scalar optimizations. For example, loop unrolling and loop unrolland-jam transformations exploit the large regist...
Rakesh Krishnaiyer, Dattatraya Kulkarni, Daniel M....
SC
2003
ACM
14 years 19 days ago
Identifying and Exploiting Spatial Regularity in Data Memory References
The growing processor/memory performance gap causes the performance of many codes to be limited by memory accesses. If known to exist in an application, strided memory accesses fo...
Tushar Mohan, Bronis R. de Supinski, Sally A. McKe...
ICPP
1999
IEEE
13 years 11 months ago
Coherence-Centric Logging and Recovery for Home-Based Software Distributed Shared Memory
The probability of failures in software distributed shared memory (SDSM) increases as the system size grows. This paper introduces a new, efficient message logging technique, call...
Angkul Kongmunvattana, Nian-Feng Tzeng
ISCA
1994
IEEE
117views Hardware» more  ISCA 1994»
13 years 11 months ago
Evaluating Stream Buffers as a Secondary Cache Replacement
Today's commodity microprocessors require a low latency memory system to achieve high sustained performance. The conventional high-performance memory system provides fast dat...
Subbarao Palacharla, Richard E. Kessler