Sciweavers

61 search results - page 7 / 13
» Evaluation of Dynamic Data Distributions on NUMA Shared Memo...
Sort
View
PPOPP
2009
ACM
14 years 8 months ago
A compiler-directed data prefetching scheme for chip multiprocessors
Data prefetching has been widely used in the past as a technique for hiding memory access latencies. However, data prefetching in multi-threaded applications running on chip multi...
Dhruva Chakrabarti, Mahmut T. Kandemir, Mustafa Ka...
IPPS
2009
IEEE
14 years 2 months ago
Using hardware transactional memory for data race detection
Abstract—Widespread emergence of multicore processors will spur development of parallel applications, exposing programmers to degrees of hardware concurrency hitherto unavailable...
Shantanu Gupta, Florin Sultan, Srihari Cadambi, Fr...
HPCA
2007
IEEE
14 years 1 months ago
An Adaptive Cache Coherence Protocol Optimized for Producer-Consumer Sharing
Shared memory multiprocessors play an increasingly important role in enterprise and scientific computing facilities. Remote misses limit the performance of shared memory applicat...
Liqun Cheng, John B. Carter, Donglai Dai
IPPS
1999
IEEE
13 years 11 months ago
A Graph Based Framework to Detect Optimal Memory Layouts for Improving Data Locality
In order to extract high levels of performance from modern parallel architectures, the effective management of deep memory hierarchies is very important. While architectural advan...
Mahmut T. Kandemir, Alok N. Choudhary, J. Ramanuja...
CLUSTER
2009
IEEE
13 years 11 months ago
Analytical modeling and optimization for affinity based thread scheduling on multicore systems
Abstract--This paper proposes an analytical model to estimate the cost of running an affinity-based thread schedule on multicore systems. The model consists of three submodels to e...
Fengguang Song, Shirley Moore, Jack Dongarra