Sciweavers

403 search results - page 39 / 81
» On Using Incremental Profiling for the Performance Analysis ...
Sort
View
ICS
2005
Tsinghua U.
14 years 2 months ago
Lightweight reference affinity analysis
Previous studies have shown that array regrouping and structure splitting significantly improve data locality. The most effective technique relies on profiling every access to eve...
Xipeng Shen, Yaoqing Gao, Chen Ding, Roch Archamba...
PPOPP
2009
ACM
14 years 9 months ago
A compiler-directed data prefetching scheme for chip multiprocessors
Data prefetching has been widely used in the past as a technique for hiding memory access latencies. However, data prefetching in multi-threaded applications running on chip multi...
Dhruva Chakrabarti, Mahmut T. Kandemir, Mustafa Ka...
ICCS
2005
Springer
14 years 2 months ago
Performance and Scalability Analysis of Cray X1 Vectorization and Multistreaming Optimization
Cray X1 Fortran and C/C++ compilers provide a number of loop transformations, notably vectorization and multistreaming, in order to exploit the multistreaming processor (MSP) hard...
Sadaf R. Alam, Jeffrey S. Vetter
PDP
2010
IEEE
14 years 1 months ago
hwloc: A Generic Framework for Managing Hardware Affinities in HPC Applications
The increasing numbers of cores, shared caches and memory nodes within machines introduces a complex hardware topology. High-performance computing applications now have to carefull...
François Broquedis, Jérôme Cle...
ASPLOS
2010
ACM
14 years 3 months ago
Addressing shared resource contention in multicore processors via scheduling
Contention for shared resources on multicore processors remains an unsolved problem in existing systems despite significant research efforts dedicated to this problem in the past...
Sergey Zhuravlev, Sergey Blagodurov, Alexandra Fed...