— Many of the issues that will be faced by the designers of multi-billion transistor chips may be alleviated by the presence of a flexible global communication infrastructure. I...
Buffered CoScheduled MPI (BCS-MPI) introduces a new approach to design the communication layer for largescale parallel machines. The emphasis of BCS-MPI is on the global coordinat...
Abstract—This paper presents a novel scalable switching architecture for input queued switches with its proper arbitration algorithms. In contrast to traditional switching archit...
This paper presents a novel technique to perform global optimization of communication and preprocessing calls in the presence of array accesses with arbitrary subscripts. Our sche...
High performance computing on parallel architectures currently uses different approaches depending on the hardory model of the architecture, the abstraction level of the programmi...