Metacomputing is the seamless application of geographically-separated distributed computing resources to user applications. We consider the scheduling of metaapplications;...
In this paper, we argue for the power of providing a common set of OS services to wide area applications, including mechanisms for resource discovery, a global namespace, remote p...
Amin Vahdat, Thomas E. Anderson, Michael Dahlin, E...
We describe a methodology that enables the real-time diagnosis of performance problems in complex high-performance distributed systems. The methodology includes tools for generati...
Brian Tierney, William E. Johnston, Brian Crowley,...
The potential for faults in distributed computing systems is a significant complicating factor for application developers. While a variety of techniques exist for detecting and co...
Paul Stelling, Ian T. Foster, Carl Kesselman, Crai...
This paper introduces Strings, a high performance distributed shared memory system designed for clusters of symmetrical multiprocessors (SMPs). The distinguishing feature of this ...
Conventional resource management systems use a system model to describe resources and a centralized scheduler to control their allocation. We argue that this paradigm does not ada...
We propose a new parallel, noncollective I/O strategy called Distant I/O that targets clustered computer systems in which disks are attached to compute nodes. Distant I/O allows o...
To support a vast set of user requirements, flexibility and extensibility are essential features of a metasystem architecture. We present an execution model, the Reflective Graph ...
Anh Nguyen-Tuong, Steve J. Chapin, Andrew S. Grims...