Sciweavers

» Model-based fault localization in large-scale computing syst...
DSN
2000
IEEE
14 years 9 days ago
Data Replication Strategies for Fault Tolerance and Availability on Commodity Clusters
Recent work has shown the advantages of using persistent memory for transaction processing. In particular, the Vista transaction system uses recoverable memory to avoid disk I/O, ...
Cristiana Amza, Alan L. Cox, Willy Zwaenepoel
ICRA
2006
IEEE
14 years 1 month ago
Distributed Diagnosis of Coupled Mobile Robots
Fault diagnosis of coupled mobile robots requires a large number of measurements to be communicated either between the robots or from the robots to a central diagnoser. As comp...
Matthew J. Daigle, Xenofon D. Koutsoukos, Gautam B...
MIDDLEWARE
2009
Springer
14 years 2 months ago
Why Do Upgrades Fail and What Can We Do about It?
Enterprise-system upgrades are unreliable and often produce downtime or data loss. Errors in the upgrade procedure, such as broken dependencies, constitute the leading ca...
Tudor Dumitras, Priya Narasimhan
MIDDLEWARE
2004
Springer
14 years 1 month ago
Architecture for resource allocation services supporting interactive remote desktop sessions in utility grids
Emerging large scale utility computing systems like Grids promise computing and storage to be provided to end users as a utility. System management services deployed in the middle...
Vanish Talwar, Bikash Agarwalla, Sujoy Basu, Raj K...
GECCO
2007
Springer
13 years 11 months ago
Evolving distributed agents for managing air traffic
Air traffic management offers an intriguing real world challenge to designing large scale distributed systems using evolutionary computation. The ability to evolve effective air t...
Adrian K. Agogino, Kagan Tumer