Sciweavers

392 search results - page 32 / 79
» Fault Tolerance in a DSM Cluster Operating System
CLUSTER
2000
IEEE
14 years 1 month ago
SilkRoad: A Multithreaded Runtime System with Software Distributed Shared Memory for SMP Clusters
Multithreaded parallel systems with software Distributed Shared Memory (DSM) are an attractive direction in cluster computing. In these systems, distributing workloads and keeping t...
Liang Peng, Weng-Fai Wong, Ming-Dong Feng, Chung-K...
SRDS
2007
IEEE
14 years 3 months ago
Customizable Fault Tolerance for Wide-Area Replication
Constructing logical machines out of collections of physical machines is a well-known technique for improving the robustness and fault tolerance of distributed systems. We present...
Yair Amir, Brian A. Coan, Jonathan Kirsch, John La...
CCGRID
2008
IEEE
14 years 3 months ago
A Technique for Lock-Less Mirroring in Parallel File Systems
As parallel file systems span larger and larger numbers of nodes in order to provide the performance and scalability necessary for modern cluster applications, the need for fau...
Bradley W. Settlemyer, Walter B. Ligon III
CLOUD
2010
ACM
14 years 1 month ago
Robust and flexible power-proportional storage
Power-proportional cluster-based storage is an important component of an overall cloud computing infrastructure. With it, substantial subsets of nodes in the storage cluster can b...
Hrishikesh Amur, James Cipar, Varun Gupta, Gregory...
HICSS
2005
IEEE
14 years 2 months ago
Fault Analysis of a Distributed Flight Control System
This paper presents how state consistency among distributed control nodes is maintained in the presence of faults. We analyze a fault tolerant semi-synchronous architecture concep...
Kristina Forsberg, Simin Nadjm-Tehrani, Jan Torin