Sciweavers

» Optimizing Memory System Performance for Communication in Pa...
IPPS
2006
IEEE
14 years 3 months ago
A decomposition approach for optimizing the performance of MPI libraries
MPI provides a portable message passing interface for many parallel execution platforms but may lead to inefficiencies for some platforms and applications. In this article we sho...
O. Hartmann, Matthias Kühnemann, Thomas Raube...
LCTRTS
2005
Springer
14 years 2 months ago
Cache aware optimization of stream programs
Effective use of the memory hierarchy is critical for achieving high performance on embedded systems. We focus on the class of streaming applications, which is increasingly preval...
Janis Sermulins, William Thies, Rodric M. Rabbah, ...
CASES
2009
ACM
14 years 3 months ago
A buffer replacement algorithm exploiting multi-chip parallelism in solid state disks
Solid State Disks (SSDs) are superior to magnetic disks from a performance point of view due to the favorable features of NAND flash memory. Furthermore, thanks to improvement on...
Jinho Seol, Hyotaek Shim, Jaegeuk Kim, Seungryoul ...
PAAPP
2002
13 years 9 months ago
Static performance prediction of skeletal parallel programs
We demonstrate that the run time of implicitly parallel programs can be statically predicted with considerable accuracy when expressed within the constraints of a skeletal, shapel...
Yasushi Hayashi, Murray Cole
ICS
2009
Tsinghua U.
14 years 4 months ago
QuakeTM: parallelizing a complex sequential application using transactional memory
“Is transactional memory useful?” is the question that cannot be answered until we provide substantial applications that can evaluate its capabilities. While existing TM appli...
Vladimir Gajinov, Ferad Zyulkyarov, Osman S. Unsal...