Sciweavers

923 search results - page 114 / 185
» Shared Memory Performance Profiling
Sort
View
IPPS
2006
IEEE
14 years 2 months ago
Coterminous locality and coterminous group data prefetching on chip-multiprocessors
Due to shared cache contentions and interconnect delays, data prefetching is more critical in alleviating penalties from increasing memory latencies and demands on Chip-Multiproce...
Xudong Shi, Zhen Yang, Jih-Kwon Peir, Lu Peng, Yen...
PDP
2010
IEEE
14 years 1 months ago
hwloc: A Generic Framework for Managing Hardware Affinities in HPC Applications
The increasing numbers of cores, shared caches and memory nodes within machines introduces a complex hardware topology. High-performance computing applications now have to carefull...
François Broquedis, Jérôme Cle...
IPPS
1997
IEEE
14 years 1 months ago
DPF: A Data Parallel Fortran Benchmark Suite
We present the Data Parallel Fortran (DPF) benchmark suite, a set of data parallel Fortran codes forevaluatingdata parallel compilers appropriatefor any target parallel architectu...
Y. Charlie Hu, S. Lennart Johnsson, Dimitris Kehag...
HPCA
2012
IEEE
12 years 4 months ago
Decoupled dynamic cache segmentation
The least recently used (LRU) replacement policy performs poorly in the last-level cache (LLC) because temporal locality of memory accesses is filtered by first and second level...
Samira Manabi Khan, Zhe Wang, Daniel A. Jimé...
CF
2006
ACM
14 years 17 days ago
An efficient cache design for scalable glueless shared-memory multiprocessors
Traditionally, cache coherence in large-scale shared-memory multiprocessors has been ensured by means of a distributed directory structure stored in main memory. In this way, the ...
Alberto Ros, Manuel E. Acacio, José M. Garc...