Minimizing communications when mapping affine loop nests onto distributed memory parallel computers has already drawn a lot of attention. This paper focuses on the next step: as i...
A concurrent partitioner for partitioning unstructured finite element meshes on distributed memory architectures is developed. The partitioner uses an element-based partitioning st...
This paper presents the design and implementation of a Sliding Memory Plane (SliM) Array Processor, a mesh-connected SIMD architecture. To build the array processor, we developed ...
- Partitionability allows the creation of many physically independent subsystems, each of which retains the same functionality as its parent network and has no communication in...
Brewer and Kuszmaul [BK94] demonstrated how barriers and traffic interleaving can alleviate the problem of bulk-transfer performance degradation on the Thinking Machines CM-5, by ...
Eric A. Brewer, Paul Gauthier, Armando Fox, Angela...
We introduce dag consistency, a relaxed consistency model for distributed shared memory which is suitable for multithreaded programming. We have implemented dag consistency in sof...
Robert D. Blumofe, Matteo Frigo, Christopher F. Jo...
In a shared-memory multiprocessor system, it is possible that certain synchronization operations are redundant; that is, their corresponding sequencing requirements are enforced c...
Shuvra S. Bhattacharyya, Sundararajan Sriram, Edwa...
Advances in multiprocessor interconnect technology are leading to high performance networks. However, software overheads associated with message passing are limiting the processors ...
Debashis Basak, Dhabaleswar K. Panda, Mohammad Ban...
This paper investigates methods to locate system resources, such as expensive hardware or software modules, to provide the most effective cost/performance tradeoffs in a torus p...