Sciweavers

» Counting Polyominoes: A Parallel Implementation for Cluster ...
IJHPCA
2010
13 years 6 months ago
A Pipelined Algorithm for Large, Irregular All-Gather Problems
We describe and evaluate a new, pipelined algorithm for large, irregular all-gather problems. In the irregular all-gather problem each process in a set of processes contributes in...
Jesper Larsson Träff, Andreas Ripke, Christia...
SOSP
1997
ACM
13 years 9 months ago
Towards Transparent and Efficient Software Distributed Shared Memory
Despite a large research effort, software distributed shared memory systems have not been widely used to run parallel applications across clusters of computers. The higher perform...
Daniel J. Scales, Kourosh Gharachorloo
IPPS
2008
IEEE
14 years 2 months ago
Providing flow based performance guarantees for buffered crossbar switches
Buffered crossbar switches are a special type of combined input-output queued switches with each crosspoint of the crossbar having small on-chip buffers. The introduction of cross...
Deng Pan, Yuanyuan Yang
CLUSTER
2009
IEEE
14 years 13 days ago
Finding a tradeoff between host interrupt load and MPI latency over Ethernet
Achieving high-performance message passing on top of generic Ethernet hardware suffers from the NIC interrupt-driven model where coalescing is usually involved. We present an in-...
Brice Goglin, Nathalie Furmento
PLDI
2011
ACM
12 years 10 months ago
Automatic compilation of MATLAB programs for synergistic execution on heterogeneous processors
MATLAB is an array language, initially popular for rapid prototyping, but is now being increasingly used to develop production code for numerical and scientific applications. Typ...
Ashwin Prasad, Jayvant Anantpur, R. Govindarajan