In this paper, we present adaptive fault-tolerant deadlock-free routing algorithms for hypercubes and meshes by using only 3 virtual channels and 2 virtual channels respectively. ...
Barrier synchronization is a crucial operation for parallel systems. Many schemes have been proposed in the literature to achieve fast barrier synchronization through software, ha...
Rajeev Sivaram, Craig B. Stunkel, Dhabaleswar K. P...
Improvements in optical technology will enable the constructionof high bandwidth, low latencyswitching networks. These networks have many applications in massively parallel proces...
This paper presents a compile-time scheme for partitioning non-rectangular loop nests which consist of inner loops whose bounds depend on the index of the outermost, parallel loop...
Algebraic factorization is an extremely important part of any logic synthesis system but is computationally expensive. Hence it is important to look at parallel processing to spee...
In this paper, we present two new run-time algorithms for the parallelization of loops that have indirect access patterns. The algorithms can handle any type of loop-carried depen...
Tools for parallel systems today range from specification over debugging to performance analysis and more. Typically, they help the programmers of parallel algorithms from the ea...
Several optimizations to the Time Warp synchronization protocol for parallel discrete event simulation have been proposed and studied. Many of these optimizations have included so...
Radharamanan Radhakrishnan, Lantz Moore, Philip A....
The large design space of modern computer architectures calls for performance modelling tools to facilitate the evaluation of different alternatives. In this paper, we give an ove...