Clusters of commodity computers are currently being used to provide the scalability required by severalpopular Internet services. In this paper we evaluate an efficient cluster-b...
Enrique V. Carrera, Srinath Rao, Liviu Iftode, Ric...
Hiding communication latency is an important optimization for parallel programs. Programmers or compilers achieve this by using non-blocking communication primitives and overlappi...
Abstract. The functional performance model (FPM) of heterogeneous processors has proven to be more realistic than the traditional models because it integrates many important featur...
This paper presents a novel optimizing compiler for general purpose computation on graphics processing units (GPGPU). It addresses two major challenges of developing high performa...
TreadMarks is a distributed shared memory DSM system for standard Unix systems such as SunOS and Ultrix. This paper presents a performance evaluation of TreadMarks running on Ultr...
Peter J. Keleher, Alan L. Cox, Sandhya Dwarkadas, ...