Short vector (SIMD) instructions are useful in signal processing, multimedia, and scientific applications. They offer higher performance, lower energy consumption, and better res...
We have designed, implemented, and evaluated AtomCaml, an extension to Objective Caml that provides a synchronization primitive for atomic (transactional) execution of code. A fir...
Load balancing involves assigning to each processor, work proportional to its performance, minimizing the execution time of the program. Althoughstatic load balancing can solve ma...
Mohammed Javeed Zaki, Wei Li, Srinivasan Parthasar...
We present the architecture of nreduce, a distributed virtual machine which uses parallel graph reduction to run programs across a set of computers. It executes code written in a ...
Peter M. Kelly, Paul D. Coddington, Andrew L. Wend...
Due to the complexity associated with developing parallel applications, scientists and engineers rely on highlevel software libraries such as PETSc, ScaLAPACK and PESSL to ease th...
Pavan Balaji, Darius Buntinas, Satish Balay, Barry...