Sciweavers

52 search results - page 6 / 11
» Quantifying Locality In The Memory Access Patterns of HPC Ap...
Sort
View
HIPEAC
2011
Springer
12 years 8 months ago
NoC-aware cache design for multithreaded execution on tiled chip multiprocessors
In chip multiprocessors (CMPs), data accesslatency dependson the memory hierarchy organization, the on-chip interconnect (NoC), and the running workload. Reducing data access late...
Ahmed Abousamra, Alex K. Jones, Rami G. Melhem
PARCO
2003
13 years 9 months ago
Cache Memory Behavior of Advanced PDE Solvers
Three different partial differential equation (PDE) solver kernels are analyzed in respect to cache memory performance on a simulated shared memory computer. The kernels implement...
Dan Wallin, Henrik Johansson, Sverker Holmgren
HPDC
1999
IEEE
14 years 21 days ago
Dodo: A User-level System for Exploiting Idle Memory in Workstation Clusters
In this paper, we present the design and implementation of Dodo, an e cient user-level system for harvesting idle memory in o -the-shelf clusters of workstations. Dodo enables dat...
Samir Koussih, Anurag Acharya, Sanjeev Setia
IPPS
2007
IEEE
14 years 2 months ago
Optimizing Inter-Nest Data Locality Using Loop Splitting and Reordering
With the increasing gap between processor speed and memory latency, the performance of data-dominated programs are becoming more reliant on fast data access, which can be improved...
Sofiane Naci
ISCA
1995
IEEE
93views Hardware» more  ISCA 1995»
13 years 12 months ago
Optimizing Memory System Performance for Communication in Parallel Computers
Communicationin aparallel systemfrequently involvesmoving data from the memory of one node to the memory of another; this is the standard communication model employedin message pa...
Thomas Stricker, Thomas R. Gross