Sciweavers

392 search results - page 26 / 79
» Fault Tolerance in a DSM Cluster Operating System
IPPS
2005
IEEE
14 years 2 months ago
Current Practice and a Direction Forward in Checkpoint/Restart Implementations for Fault Tolerance
Checkpoint/restart is a general idea for which particular implementations enable various functionalities in computer systems, including process migration, gang scheduling, hiberna...
José Carlos Sancho, Fabrizio Petrini, Kei D...
CLUSTER
2002
IEEE
14 years 1 months ago
BioOpera: Cluster-Aware Computing
In this paper we present BioOpera, an extensible process support system for cluster-aware computing. It features an intuitive way to specify computations, as well as improved supp...
Win Bausch, Cesare Pautasso, Reto Schaeppi, Gustav...
CLUSTER
2011
IEEE
12 years 8 months ago
Dynamic Load Balance for Optimized Message Logging in Fault Tolerant HPC Applications
Computing systems will grow significantly larger in the near future to satisfy the needs of computational scientists in areas like climate modeling, biophysics and cosmology. S...
Esteban Meneses, Laxmikant V. Kalé, Greg Br...
USENIX
2008
13 years 11 months ago
Improving Scalability and Fault Tolerance in an Application Management Infrastructure
This paper explores the challenges associated with distributed application management in large-scale computing environments. In particular, we investigate several techniques for e...
Nikolay Topilski, Jeannie R. Albrecht, Amin Vahdat
ETFA
2006
IEEE
14 years 2 months ago
Fault Tolerance for Manufacturing Components
The more the information technologies begin to be incorporated into the industrial productive fabric, the more complex it becomes to organise them. It is vital to implant proactiv...
Diego Marcos-Jorquera, Francisco Maciá Pé...