Sciweavers

481 search results - page 79 / 97
» Performance Modeling and Measurement of Parallelized Code fo...
ICS
2005
Tsinghua U.
14 years 1 month ago
Optimization of MPI collective communication on BlueGene/L systems
BlueGene/L is currently the world’s fastest supercomputer. It consists of a large number of low power dual-processor compute nodes interconnected by high speed torus and collect...
George Almási, Philip Heidelberger, Charles...
SPAA
1996
ACM
13 years 11 months ago
From AAPC Algorithms to High Performance Permutation Routing and Sorting
Several recent papers have proposed or analyzed optimal algorithms to route all-to-all personalized communication (AAPC) over communication networks such as meshes, hypercubes and ...
Thomas Stricker, Jonathan C. Hardwick
IPPS
2006
IEEE
14 years 1 month ago
Improving cooperation in peer-to-peer systems using social networks
Rational and selfish nodes in P2P systems usually lack effective incentives to cooperate, contributing to the increase of free-riders, and degrading the system performance. Variou...
Wenyu Wang, Li Zhao, Ruixi Yuan
IPPS
1999
IEEE
13 years 12 months ago
Dynamically Scheduling the Trace Produced During Program Execution into VLIW Instructions
VLIW machines possibly provide the most direct way to exploit instruction level parallelism; however, they cannot be used to emulate current general-purpose instruction set archit...
Alberto Ferreira de Souza, Peter Rounce
HPCA
1998
IEEE
13 years 11 months ago
The Effectiveness of SRAM Network Caches in Clustered DSMs
The frequency of accesses to remote data is a key factor affecting the performance of all Distributed Shared Memory (DSM) systems. Remote data caching is one of the most effective...
Adrian Moga, Michel Dubois