Sciweavers

901 search results - page 40 / 181
» Hiding Communication Latency in Data Parallel Applications
ICUMT
2009
13 years 5 months ago
Two-layer network coordinate system for Internet distance prediction
Network coordinate (NC) system is an efficient and scalable system for Internet distance prediction. In this paper, we propose 3 two-layer NC systems HNPS, HBBS and HIDES derived f...
Chengbo Dong, Guodong Wang, Xuan Zhang, Beixing De...
CCGRID
2008
IEEE
14 years 2 months ago
MPI Collectives on Modern Multicore Clusters: Performance Optimizations and Communication Characteristics
The advances in multicore technology and modern interconnects are rapidly accelerating the number of cores deployed in today’s commodity clusters. A majority of parallel applicat...
Amith R. Mamidala, Rahul Kumar, Debraj De, Dhabale...
ICPP
2005
IEEE
14 years 1 months ago
LiMIC: Support for High-Performance MPI Intra-node Communication on Linux Cluster
High performance intra-node communication support for MPI applications is critical for achieving best performance from clusters of SMP workstations. Present day MPI stacks cannot ...
Hyun-Wook Jin, Sayantan Sur, Lei Chai, Dhabaleswar...
ICS
2005
Tsinghua U.
14 years 1 months ago
Parallel sparse LU factorization on second-class message passing platforms
Several message passing-based parallel solvers have been developed for general (non-symmetric) sparse LU factorization with partial pivoting. Due to the fine-grain synchronizatio...
Kai Shen
FPL
2006
Springer
13 years 11 months ago
TMD-MPI: An MPI Implementation for Multiple Processors Across Multiple FPGAs
With current FPGAs, designers can now instantiate several embedded processors, memory units, and a wide variety of IP blocks to build a single-chip, high-performance multiprocesso...
Manuel Saldaña, Paul Chow