Sciweavers

275 search results - page 9 / 55
» Dynamic and Fault-tolerant Cluster Management
HIPC
2009
Springer
13 years 7 months ago
Fast checkpointing by Write Aggregation with Dynamic Buffer and Interleaving on multicore architecture
Large scale compute clusters continue to grow to ever-increasing proportions. However, as clusters and applications continue to grow, the Mean Time Between Failures (MTBF) has redu...
Xiangyong Ouyang, Karthik Gopalakrishnan, Tejus Ga...
PVM
2009
Springer
14 years 4 months ago
VolpexMPI: An MPI Library for Execution of Parallel Applications on Volatile Nodes
The objective of this research is to convert ordinary idle PCs into virtual clusters for executing parallel applications. The paper introduces VolpexMPI that is designed to enable ...
Troy LeBlanc, Rakhi Anand, Edgar Gabriel, Jaspal S...
CLUSTER
2002
IEEE
13 years 9 months ago
Condor-G: A Computation Management Agent for Multi-Institutional Grids
In recent years, there has been a dramatic increase in the amount of available computing and storage resources. Yet few have been able to exploit these resources in an aggregated ...
James Frey, Todd Tannenbaum, Miron Livny, Ian T. F...
AINA
2004
IEEE
14 years 1 month ago
An Efficient Clustered Architecture for P2P Networks
Peer-to-peer (P2P) computing offers many attractive features, such as self-organization, load-balancing, availability, fault tolerance, and anonymity. However, it also faces some ...
Juan Li, Son T. Vuong
EMSOFT
2007
Springer
14 years 4 months ago
A dynamic scheduling approach to designing flexible safety-critical systems
The design of safety-critical systems has typically adopted static techniques to simplify error detection and fault tolerance. However, economic pressure to reduce costs is exposi...
Luís Almeida, Sebastian Fischmeister, Madhu...