Sciweavers

» Analytical Performance Models of Parallel Programs in Cluste...
IPPS
2002
IEEE
14 years 1 month ago
Portals 3.0: Protocol Building Blocks for Low Overhead Communication
This paper describes the evolution of the Portals message passing architecture and programming interface from its initial development on tightly-coupled massively parallel platfor...
Ron Brightwell, William Lawry, Arthur B. Maccabe, ...
PPOPP
2009
ACM
14 years 9 months ago
Exploiting global optimizations for OpenMP programs in the OpenUH compiler
The advent of new parallel architectures has increased the need for parallel optimizing compilers to assist developers in creating efficient code. OpenUH is a state-of-the-art opt...
Lei Huang, Deepak Eachempati, Marcus W. Hervey, Ba...
EUROPAR
1997
Springer
14 years 1 month ago
Prefetching and Multithreading Performance in Bus-Based Multiprocessors with Petri Nets
The large latency of memory accesses is a major obstacle in obtaining high processor utilization in large scale shared-memory multiprocessors. Access to remote memory is likely to ...
Edward D. Moreno, Sergio Takeo Kofuji, Marcelo H. ...
IPPS
2009
IEEE
14 years 3 months ago
Application profiling on Cell-based clusters
In this paper, we present a methodology for profiling parallel applications executing on the IBM PowerXCell 8i (commonly referred to as the “Cell” processor). Specifically, we...
Hikmet Dursun, Kevin J. Barker, Darren J. Kerbyson...
PROCEDIA
2010
13 years 3 months ago
SysCellC: a data-flow programming model on multi-GPU
High performance computing with low cost machines becomes a reality with GPU. Unfortunately, high performances are achieved when the programmer exploits the architectural specific...
Dominique Houzet, Sylvain Huet, Anis Rahman