Sciweavers

IPPS
1998
IEEE
13 years 11 months ago
COMPaS: A Pentium Pro PC-based SMP Cluster and Its Experience
We have built an eight node SMP cluster called COMPaS (Cluster Of Multi-Processor Systems), each node of which is a quadprocessor Pentium Pro PC. We have designed and implemented a...
Yoshio Tanaka, Motohiko Matsuda, Makoto Ando, Kazu...
PVM
1998
Springer
13 years 11 months ago
SKaMPI: A Detailed, Accurate MPI Benchmark
Abstract. SKaMPI is a benchmark for MPI implementations. Its purpose is the detailed analysis of the runtime of individual MPI operations and comparison of these for different imple...
Ralf Reussner, Peter Sanders, Lutz Prechelt, Matth...
CC
2003
Springer
14 years 13 hours ago
Improving Data Locality by Chunking
Cache memories were invented to decouple fast processors from slow memories. However, this decoupling is only partial, and many researchers have attempted to improve cache use by p...
Cédric Bastoul, Paul Feautrier
PPAM
2005
Springer
14 years 8 days ago
A Web Computing Environment for Parallel Algorithms in Java
We present a web computing library (PUBWCL) in Java that allows to execute tightly coupled, massively parallel algorithms in the bulk-synchronous (BSP) style on PCs distributed ove...
Olaf Bonorden, Joachim Gehweiler, Friedhelm Meyer ...
IEEEPACT
2006
IEEE
14 years 24 days ago
Compiling for stream processing
This paper describes a compiler for stream programs that efficiently schedules computational kernels and stream memory operations, and allocates on-chip storage. Our compiler uses...
Abhishek Das, William J. Dally, Peter R. Mattson