In this paper, we present a new fault tolerance system called DejaVu for transparent and automatic checkpointing, migration, and recovery of parallel and distributed applications....
Joseph F. Ruscio, Michael A. Heffner, Srinidhi Var...
Large-scale systems such as the Grid need scalable and efficient resource allocation mechanisms to fulfil the requirements of their participants and applications while the whole s...
The amount of Task Level Parallelism (TLP) in a runtime workload is useful information for determining the efficient usage of multiprocessors. This paper presents mechanisms to dynami...
This paper discusses the design, development, and use of a performance monitoring tool for Distributed Interactive Simulations (DIS). A typical DIS environment consists of hundred...
GlobalWatch is a distributed platform to monitor various resources of grid platforms so as to improve the flexibility and usability of grid systems. In order to enhance the flexib...
Sheng Di, Hai Jin, Shengli Li, Ling Chen, Chengwei...