Sciweavers

239 search results - page 1 / 48
» A Performance Evaluation of the Convex SPP-1000 Scalable Sha...
Sort
View
SC
1995
ACM
13 years 11 months ago
A Performance Evaluation of the Convex SPP-1000 Scalable Shared Memory Parallel Computer
The Convex SPP-1000 is the first commercial implementation of a new generation of scalable shared memory parallel computers with full cache coherence. It employs a hierarchical s...
Thomas L. Sterling, Daniel Savarese, Peter MacNeic...
ICPP
1996
IEEE
13 years 11 months ago
Scheduling of Wavefront Parallelism on Scalable Shared-memory Multiprocessors
Tiling exploits temporal reuse carried by an outer loop of a loop nest to enhance cache locality. Loop skewing is typically required to make tiling legal. This restricts parallelis...
Naraig Manjikian, Tarek S. Abdelrahman
ICPP
1995
IEEE
13 years 11 months ago
Fusion of Loops for Parallelism and Locality
Loop fusion improves data locality and reduces synchronization in data-parallel applications. However, loop fusion is not always legal. Even when legal, fusion may introduce loop-...
Naraig Manjikian, Tarek S. Abdelrahman
HIPC
2007
Springer
14 years 1 months ago
Direct Coherence: Bringing Together Performance and Scalability in Shared-Memory Multiprocessors
Traditional directory-based cache coherence protocols suffer from long-latency cache misses as a consequence of the indirection introduced by the home node, which must be accessed...
Alberto Ros, Manuel E. Acacio, José M. Garc...
EUROPAR
2005
Springer
14 years 29 days ago
A Novel Lightweight Directory Architecture for Scalable Shared-Memory Multiprocessors
There are two important hurdles that restrict the scalability of directory-based shared-memory multiprocessors: the directory memory overhead and the long L2 miss latencies due to ...
Alberto Ros, Manuel E. Acacio, José M. Garc...