Most cluster systems used in high performance computing do not allow process relocation at run-time. Finding an allocation that results in minimal completion time is NP-hard and (n...
This paper examines the effectiveness of decoupling as an optimization technique for high-performance computer architectures. Decoupled access execute architectures are described,...
Peter L. Bird, Alasdair Rawsthorne, Nigel P. Topha...
Exploiting locality at run-time is a complementary approach to a compiler approach for those applications with dynamic memory access patterns. This paper proposes a memory-layout ...
We present the design and development of a Performance Data Representation System (PDRS) for scalable parallel computing. PDRS provides decision support that helps users find the r...
This paper describes an infrastructure that enables transparent development of image processing software for parallel computers. The infrastructure’s main component is an image ...