Sciweavers

901 search results - page 90 / 181
» Hiding Communication Latency in Data Parallel Applications
HOTI
2011
IEEE
12 years 7 months ago
The Common Communication Interface (CCI)
There are many APIs for connecting and exchanging data between network peers. Each interface varies wildly based on metrics including performance, portability, and complexity. S...
Scott Atchley, David Dillow, Galen M. Shipman, Pat...
EUROPAR
2008
Springer
13 years 9 months ago
MPC: A Unified Parallel Runtime for Clusters of NUMA Machines
Over the last decade, Message Passing Interface (MPI) has become a very successful parallel programming environment for distributed memory architectures such as clusters. However, ...
Marc Pérache, Hervé Jourdren, Raymon...
MICRO
2007
IEEE
14 years 2 months ago
A Practical Approach to Exploiting Coarse-Grained Pipeline Parallelism in C Programs
The emergence of multicore processors has heightened the need for effective parallel programming practices. In addition to writing new parallel programs, the next generation of pr...
William Thies, Vikram Chandrasekhar, Saman P. Amar...
ICS
1993
Tsinghua U.
14 years 1 day ago
The EM-4 Under Implicit Parallelism
The EM-4 is a supercomputer that offers very fast interprocessor communication and support for multithreading. In this paper we demonstrate that the EM-4, together with an auto...
Lubomir Bic, Mayez A. Al-Mouhamed
IPPS
2006
IEEE
14 years 2 months ago
Dual-layered file cache on cc-NUMA system
CC-NUMA is a widely adopted and deployed architecture of high performance computers. These machines are attractive for their transparent access to local and remote memory. However...
Zhou Yingchao, Meng Dan, Ma Jie