Sciweavers

442 search results - page 54 / 89
» Fault Tolerant Wide-Area Parallel Computing
Sort
View
CLUSTER
2002
IEEE
13 years 9 months ago
Condor-G: A Computation Management Agent for Multi-Institutional Grids
In recent years, there has been a dramatic increase in the amount of available computing and storage resources. Yet few have been able to exploit these resources in an aggregated ...
James Frey, Todd Tannenbaum, Miron Livny, Ian T. F...
IPPS
2007
IEEE
14 years 3 months ago
DejaVu: Transparent User-Level Checkpointing, Migration, and Recovery for Distributed Systems
In this paper, we present a new fault tolerance system called DejaVu for transparent and automatic checkpointing, migration, and recovery of parallel and distributed applications....
Joseph F. Ruscio, Michael A. Heffner, Srinidhi Var...
GRID
2003
Springer
14 years 2 months ago
Faults in Grids: Why are they so bad and What can be done about it?
Computational Grids have the potential to become the main execution platform for high performance and distributed applications. However, such systems are extremely complex and pro...
Raissa Medeiros, Walfredo Cirne, Francisco Vilar B...
ICDCN
2010
Springer
14 years 4 months ago
Authenticated Byzantine Generals in Dual Failure Model
Pease et al. introduced the problem of Byzantine Generals (BGP) to study the effects of Byzantine faults in distributed protocols for reliable broadcast. It is well known that BG...
Anuj Gupta, Prasant Gopal, Piyush Bansal, Kannan S...
CCGRID
2001
IEEE
14 years 1 months ago
XtremWeb: A Generic Global Computing System
Global Computing achieves high throughput computing by harvesting a very large number of unused computing resources connected to the Internet. This parallel computing model target...
Gilles Fedak, Cécile Germain, Vincent N&eac...