Sciweavers

481 search results - page 79 / 97
» Performance Modeling and Measurement of Parallelized Code fo...
Sort
View
ICS
2005
Tsinghua U.
15 years 8 months ago
Optimization of MPI collective communication on BlueGene/L systems
BlueGene/L is currently the world’s fastest supercomputer. It consists of a large number of low power dual-processor compute nodes interconnected by high speed torus and collect...
George Almási, Philip Heidelberger, Charles...
SPAA
1996
ACM
15 years 6 months ago
From AAPC Algorithms to High Performance Permutation Routing and Sorting
Several recent papers have proposed or analyzed optimal algorithms to route all-to-all personalizedcommunication (AAPC) over communication networks such as meshes, hypercubes and ...
Thomas Stricker, Jonathan C. Hardwick
IPPS
2006
IEEE
15 years 8 months ago
Improving cooperation in peer-to-peer systems using social networks
Rational and selfish nodes in P2P systems usually lack effective incentives to cooperate, contributing to the increase of free-riders, and degrading the system performance. Variou...
Wenyu Wang, Li Zhao, Ruixi Yuan
IPPS
1999
IEEE
15 years 6 months ago
Dynamically Scheduling the Trace Produced During Program Execution into VLIW Instructions
VLIW machines possibly provide the most direct way to exploit instruction level parallelism; however, they cannot be used to emulate current general-purpose instruction set archit...
Alberto Ferreira de Souza, Peter Rounce
HPCA
1998
IEEE
15 years 6 months ago
The Effectiveness of SRAM Network Caches in Clustered DSMs
The frequency of accesses to remote data is a key factor affecting the performance of all Distributed Shared Memory (DSM) systems. Remote data caching is one of the most effective...
Adrian Moga, Michel Dubois