Sciweavers

656 search results - page 59 / 132
» Scalable Parallel Matrix Multiplication on Distributed Memor...
IPPS
1996
IEEE
13 years 11 months ago
An Element-Based Concurrent Partitioner for Unstructured Finite Element Meshes
A concurrent partitioner for partitioning unstructured finite element meshes on distributed memory architectures is developed. The partitioner uses an element-based partitioning st...
Hong Q. Ding, Robert D. Ferraro
IPPS
2003
IEEE
14 years 28 days ago
Optimizing Synchronization Operations for Remote Memory Communication Systems
Synchronization operations, such as fence and locking, are used in many parallel operations accessing shared memory. However, a process which is blocked waiting for a fence operat...
Darius Buntinas, Amina Saify, Dhabaleswar K. Panda...
CLUSTER
2007
IEEE
14 years 2 months ago
The design of MPI based distributed shared memory systems to support OpenMP on clusters
OpenMP can be supported in cluster environments by using distributed shared memory (DSM) systems. A portable approach for building DSM systems is to layer it on MPI. With these...
H'sien J. Wong, Alistair P. Rendell
ICCS
2009
Springer
14 years 2 months ago
Generating Empirically Optimized Composed Matrix Kernels from MATLAB Prototypes
The development of optimized codes is time-consuming and requires extensive architecture, compiler, and language expertise; therefore, computational scientists are often forced to ...
Boyana Norris, Albert Hartono, Elizabeth R. Jessup...
IPPS
2007
IEEE
14 years 1 month ago
Domain Decomposition vs. Master-Slave in Apparently Homogeneous Systems
This paper investigates the utilization of the master-slave (MS) paradigm as an alternative to domain decomposition (DD) methods for parallelizing lattice gauge theory (LGT) model...
Cyril Banino-Rokkones