We present a parallel code generation algorithm for complete applications and a new experimental methodology that tests the efficacy of our approach. The algorithm optimizes for d...
We present a method to enhance wormhole routing algorithms for deadlock-free fault-tolerant routing in tori. We consider arbitrarily-located faulty blocks and assume only...
On large distributed memory parallel computers the global communication cost of inner products seriously limits the performance of Krylov subspace methods [3]. We consider improved ...
Exact knowledge of the heat flow in heterojunction bipolar transistors (HBT) during power operation is a key factor for the systematic improvement of power density,...
High Performance Fortran (HPF) is a data-parallel Fortran for Distributed Memory Multiprocessors. HPF provides an interesting programming model but compilers are yet to c...
Many standardized hardware communication interfaces offer runtime flexibility and configurability at the cost of efficiency. An alternative approach is the use of a highly-effic...
Steve Ward, Karim Abdalla, Rajeev Dujari, Michael ...
Data dependence analysis techniques are the main component of today’s strategies for automatic detection of parallelism. Parallelism detection strategies are being incorporate...