Vector prefix and reduction are collective communication primitives in which all processors must cooperate. We present two parallel algorithms, the direct algorithm and the split ...
We discuss a parallel algorithm for the solution of large-scale generalized algebraic Riccati equations with dimension up to O(10^5). We survey the numerical algorithms underlying ...
Lazy task creation (LTC) is an efficient approach for executing divide-and-conquer parallel programs that has been used in the implementation of Multilisp's future con...
Parallel performance monitoring extends parallel measurement systems with infrastructure and interfaces for online performance data access, communication, and analysis. At the s...
Aroon Nataraj, Allen D. Malony, Allen Morris, Dori...
Event traces are helpful in understanding the performance behavior of message-passing applications since they allow in-depth analyses of communication and synchronization patterns...
Daniel Becker, John C. Linford, Rolf Rabenseifner,...