Sciweavers

392 search results - page 52 / 79
Fault Tolerance in a DSM Cluster Operating System
ICDCS
2011
IEEE
12 years 8 months ago
Smart Redundancy for Distributed Computation
Many distributed software systems allow participation by large numbers of untrusted, potentially faulty components on an open network. As faults are inevitable in this setting, th...
Yuriy Brun, George Edwards, Jae Young Bang, Nenad ...
ICAS
2005
IEEE
14 years 2 months ago
Analyzing the Impact of Components Replication in High Available J2EE Clusters
Clustering is a well known technique that allows scalability and fault tolerance in distributed systems. In the J2EE framework, clustering can be used to improve the performance a...
Davide Rossi, Elisa Turrini
SIGCOMM
2012
ACM
11 years 11 months ago
HyperDex: a distributed, searchable key-value store
Distributed key-value stores are now a standard component of high-performance web services and cloud computing applications. While key-value stores offer significant performance...
Robert Escriva, Bernard Wong, Emin Gün Sirer
DSN
2004
IEEE
14 years 11 days ago
Cluster-Based Failure Detection Service for Large-Scale Ad Hoc Wireless Network Applications
The growing interest in ad hoc wireless network applications that are made of large and dense populations of lightweight system resources calls for scalable approaches to fault to...
Ann T. Tai, Kam S. Tso, William H. Sanders
CLUSTER
2002
IEEE
13 years 8 months ago
Condor-G: A Computation Management Agent for Multi-Institutional Grids
In recent years, there has been a dramatic increase in the amount of available computing and storage resources. Yet few have been able to exploit these resources in an aggregated ...
James Frey, Todd Tannenbaum, Miron Livny, Ian T. F...