Sciweavers

» On Using Incremental Profiling for the Performance Analysis ...
IPPS
2006
IEEE
14 years 2 months ago
Coterminous locality and coterminous group data prefetching on chip-multiprocessors
Due to shared cache contentions and interconnect delays, data prefetching is more critical in alleviating penalties from increasing memory latencies and demands on Chip-Multiproce...
Xudong Shi, Zhen Yang, Jih-Kwon Peir, Lu Peng, Yen...
SC
2003
ACM
14 years 1 month ago
Identifying and Exploiting Spatial Regularity in Data Memory References
The growing processor/memory performance gap causes the performance of many codes to be limited by memory accesses. If known to exist in an application, strided memory accesses fo...
Tushar Mohan, Bronis R. de Supinski, Sally A. McKe...
SP
2008
IEEE
13 years 8 months ago
A performance tuning methodology with compiler support
We have developed an environment, based upon robust, existing, open source software, for tuning applications written using MPI, OpenMP or both. The goal of this effort, which inte...
Oscar Hernandez, Barbara M. Chapman, Haoqiang Jin
ICPP
2006
IEEE
14 years 2 months ago
Parallel Algorithms for Evaluating Centrality Indices in Real-world Networks
This paper discusses fast parallel algorithms for evaluating several centrality indices frequently used in complex network analysis. These algorithms have been optimized to exploi...
David A. Bader, Kamesh Madduri
JAVA
2001
Springer
14 years 1 month ago
Runtime optimizations for a Java DSM implementation
Jackal is a fine-grained distributed shared memory implementation of the Java programming language. Jackal implements Java’s memory model and allows multithreaded Java programs...
Ronald Veldema, Rutger F. H. Hofman, Raoul Bhoedja...