Sciweavers

LCTRTS
2005
Springer
14 years 27 days ago
Generation of permutations for SIMD processors
Short vector (SIMD) instructions are useful in signal processing, multimedia, and scientific applications. They offer higher performance, lower energy consumption, and better res...
Alexei Kudriavtsev, Peter M. Kogge
ICFP
2005
ACM
14 years 7 months ago
AtomCaml: first-class atomicity via rollback
We have designed, implemented, and evaluated AtomCaml, an extension to Objective Caml that provides a synchronization primitive for atomic (transactional) execution of code. A fir...
Michael F. Ringenburg, Dan Grossman
HPDC
1996
IEEE
13 years 11 months ago
Customized Dynamic Load Balancing for a Network of Workstations
Load balancing involves assigning to each processor, work proportional to its performance, minimizing the execution time of the program. Although static load balancing can solve ma...
Mohammed Javeed Zaki, Wei Li, Srinivasan Parthasar...
PDCAT
2007
Springer
14 years 1 month ago
A Distributed Virtual Machine for Parallel Graph Reduction
We present the architecture of nreduce, a distributed virtual machine which uses parallel graph reduction to run programs across a set of computers. It executes code written in a ...
Peter M. Kelly, Paul D. Coddington, Andrew L. Wend...
IPPS
2007
IEEE
14 years 1 month ago
Nonuniformly Communicating Noncontiguous Data: A Case Study with PETSc and MPI
Due to the complexity associated with developing parallel applications, scientists and engineers rely on high-level software libraries such as PETSc, ScaLAPACK and PESSL to ease th...
Pavan Balaji, Darius Buntinas, Satish Balay, Barry...