Sciweavers

698 search results - page 58 / 140
» Synthesis of Fault-Tolerant Distributed Systems
HCW
1998
IEEE
14 years 15 hour ago
CCS Resource Management in Networked HPC Systems
CCS is a resource management system for parallel high-performance computers. At the user level, CCS provides vendor-independent access to parallel systems. At the system administr...
Axel Keller, Alexander Reinefeld
ICDCS
2012
IEEE
11 years 10 months ago
Combining Partial Redundancy and Checkpointing for HPC
Today’s largest High Performance Computing (HPC) systems exceed one Petaflops (10^15 floating point operations per second) and exascale systems are projected within seven years...
James Elliott, Kishor Kharbas, David Fiala, Frank ...
OSDI
1996
ACM
13 years 9 months ago
Microkernels Meet Recursive Virtual Machines
This paper describes a novel approach to providing modular and extensible operating system functionality and encapsulated environments based on a synthesis of microkernel and virtu...
Bryan Ford, Mike Hibler, Jay Lepreau, Patrick Tull...
NETGAMES
2006
ACM
14 years 1 months ago
Applying database replication to multi-player online games
Multi-player Online Games (MOGs) have emerged as popular data intensive applications in recent years. Being used by many players simultaneously, they require a high degree of faul...
Yi Lin, Bettina Kemme, Marta Patiño-Mart...
ISCA
2010
IEEE
14 years 25 days ago
Using hardware vulnerability factors to enhance AVF analysis
Fault tolerance is now a primary design constraint for all major microprocessors. One step in determining a processor’s compliance to its failure rate target is measuring the Ar...
Vilas Sridharan, David R. Kaeli