Sciweavers

328 search results - page 52 / 66
» Static performance prediction of skeletal parallel programs
Sort
View
IEEEPACT
2006
IEEE
14 years 1 months ago
Whole-program optimization of global variable layout
On machines with high-performance processors, the memory system continues to be a performance bottleneck. Compilers insert prefetch operations and reorder data accesses to improve...
Nathaniel McIntosh, Sandya Mannarswamy, Robert Hun...
IEEEPACT
1998
IEEE
13 years 11 months ago
Adaptive Scheduling of Computations and Communications on Distributed Memory Systems
Compile-time scheduling is one approach to extract parallelism which has proved effective when the execution behavior is predictable. Unfortunately, the performance of most priori...
Mayez A. Al-Mouhamed, Homam Najjari
IEEEPACT
2006
IEEE
14 years 1 months ago
Adaptive reorder buffers for SMT processors
In SMT processors, the complex interplay between private and shared datapath resources needs to be considered in order to realize the full performance potential. In this paper, we...
Joseph J. Sharkey, Deniz Balkan, Dmitry Ponomarev
ASPLOS
2008
ACM
13 years 9 months ago
Efficiency trends and limits from comprehensive microarchitectural adaptivity
Increasing demand for power-efficient, high-performance computing requires tuning applications and/or the underlying hardware to improve the mapping between workload heterogeneity...
Benjamin C. Lee, David Brooks
IEEEPACT
2002
IEEE
14 years 10 days ago
Increasing and Detecting Memory Address Congruence
A static memory reference exhibits a unique property when its dynamic memory addresses are congruent with respect to some non-trivial modulus. Extraction of this congruence inform...
Samuel Larsen, Emmett Witchel, Saman P. Amarasingh...