Sciweavers

IPPS
2007
IEEE
14 years 3 months ago
Optimizing Inter-Nest Data Locality Using Loop Splitting and Reordering
With the increasing gap between processor speed and memory latency, the performance of data-dominated programs is becoming more reliant on fast data access, which can be improved...
Sofiane Naci
ASPLOS
2008
ACM
13 years 10 months ago
The mapping collector: virtual memory support for generational, parallel, and concurrent compaction
Parallel and concurrent garbage collectors are increasingly employed by managed runtime environments (MREs) to maintain scalability, as multi-core architectures and multi-threaded...
Michal Wegiel, Chandra Krintz
ICPP
2006
IEEE
14 years 2 months ago
Data Transfers between Processes in an SMP System: Performance Study and Application to MPI
This paper focuses on the transfer of large data in SMP systems. Achieving good performance for intranode communication is critical for developing an efficient communication s...
Darius Buntinas, Guillaume Mercier, William Gropp
ANCS
2009
ACM
13 years 6 months ago
Design and performance analysis of a DRAM-based statistics counter array architecture
The problem of efficiently maintaining a large number (say millions) of statistics counters that need to be updated at very high speeds (e.g. 40 Gb/s) has received considerable re...
Haiquan (Chuck) Zhao, Hao Wang, Bill Lin, Jun (Jim...
PDS
1996
13 years 10 months ago
Towards a theory of shared data in distributed systems
We have developed a theory of sharing which captures the behaviour of programs with respect to shared data into the framework of process algebra. The core theory can describe prog...
Simon A. Dobson, Christopher P. Wadsworth