Sciweavers

836 search results - page 132 / 168
» Performance Evaluation of View-Oriented Parallel Programming...
Sort
View
165
Voted
USENIX
1994
15 years 5 months ago
TreadMarks: Distributed Shared Memory on Standard Workstations and Operating Systems
TreadMarks is a distributed shared memory DSM system for standard Unix systems such as SunOS and Ultrix. This paper presents a performance evaluation of TreadMarks running on Ultr...
Peter J. Keleher, Alan L. Cox, Sandhya Dwarkadas, ...
IPPS
2006
IEEE
15 years 10 months ago
Reducing the associativity and size of step caches in CRCW operation
Step caches are caches in which data entered to an cache array is kept valid only until the end of ongoing step of execution. Together with an advanced pipelined multithreaded arc...
M. Forsell
142
Voted
SPAA
2010
ACM
15 years 4 months ago
Buffer-space efficient and deadlock-free scheduling of stream applications on multi-core architectures
We present a scheduling algorithm of stream programs for multi-core architectures called team scheduling. Compared to previous multi-core stream scheduling algorithms, team schedu...
JongSoo Park, William J. Dally
IPPS
2007
IEEE
15 years 10 months ago
Nonuniformly Communicating Noncontiguous Data: A Case Study with PETSc and MPI
Due to the complexity associated with developing parallel applications, scientists and engineers rely on highlevel software libraries such as PETSc, ScaLAPACK and PESSL to ease th...
Pavan Balaji, Darius Buntinas, Satish Balay, Barry...
165
Voted
CLUSTER
2008
IEEE
15 years 6 months ago
Efficient one-copy MPI shared memory communication in Virtual Machines
Efficient intra-node shared memory communication is important for High Performance Computing (HPC), especially with the emergence of multi-core architectures. As clusters continue ...
Wei Huang, Matthew J. Koop, Dhabaleswar K. Panda