Sciweavers

42 search results - page 5 / 9
» Memory Requirement Optimization with Loop Fusion and Loop Sh...
Sort
View
SC
2003
ACM
14 years 21 days ago
Identifying and Exploiting Spatial Regularity in Data Memory References
The growing processor/memory performance gap causes the performance of many codes to be limited by memory accesses. If known to exist in an application, strided memory accesses fo...
Tushar Mohan, Bronis R. de Supinski, Sally A. McKe...
CF
2005
ACM
13 years 9 months ago
A case for a working-set-based memory hierarchy
Modern microprocessor designs continue to obtain impressive performance gains through increasing clock rates and advances in the parallelism obtained via micro-architecture design...
Steve Carr, Soner Önder
PLDI
2009
ACM
14 years 2 months ago
Parallelizing sequential applications on commodity hardware using a low-cost software transactional memory
Multicore designs have emerged as the mainstream design paradigm for the microprocessor industry. Unfortunately, providing multiple cores does not directly translate into performa...
Mojtaba Mehrara, Jeff Hao, Po-Chun Hsu, Scott A. M...
PLDI
2005
ACM
14 years 1 months ago
Demystifying on-the-fly spill code
Modulo scheduling is an effective code generation technique that exploits the parallelism in program loops by overlapping iterations. One drawback of this optimization is that reg...
Alex Aletà, Josep M. Codina, Antonio Gonz&a...
IEEEPACT
1999
IEEE
13 years 11 months ago
Localizing Non-Affine Array References
Existing techniques can enhance the locality of arrays indexed by affine functions of induction variables. This paper presents a technique to localize non-affine array references,...
Nicholas Mitchell, Larry Carter, Jeanne Ferrante