Sciweavers

131 search results - page 14 / 27
» Routing in Modular Fault Tolerant Multiprocessor Systems
ISCA
2002
IEEE
14 years 1 month ago
SafetyNet: Improving the Availability of Shared Memory Multiprocessors with Global Checkpoint/Recovery
We develop an availability solution, called SafetyNet, that uses a unified, lightweight checkpoint/recovery mechanism to support multiple long-latency fault detection schemes. At...
Daniel J. Sorin, Milo M. K. Martin, Mark D. Hill, ...
SIGMETRICS
2010
ACM
14 years 1 month ago
Transparent, lightweight application execution replay on commodity multiprocessor operating systems
We present S, the first system to provide transparent, lowoverhead application record-replay and the ability to go live from replayed execution. S i...
Oren Laadan, Nicolas Viennot, Jason Nieh
ICMAS
2000
13 years 10 months ago
The Adaptive Agent Architecture: Achieving Fault-Tolerance Using Persistent Broker Teams
Brokers are used in many multi-agent systems for locating agents, for routing and sharing information, for managing the system, and for legal purposes, as independent third partie...
Sanjeev Kumar, Philip R. Cohen, Hector J. Levesque
HASE
1998
IEEE
14 years 1 month ago
Combining Various Solution Techniques for Dynamic Fault Tree Analysis of Computer Systems
Fault trees provide a graphical and logical framework for analyzing the reliability of systems. A fault tree provides a conceptually simple modeling framework to represent the sys...
Ragavan Manian, Joanne Bechta Dugan, David Coppit,...
DSN
2007
IEEE
14 years 3 months ago
Utilizing Dynamically Coupled Cores to Form a Resilient Chip Multiprocessor
Aggressive CMOS scaling will make future chip multiprocessors (CMPs) increasingly susceptible to transient faults, hard errors, manufacturing defects, and process variations. Exis...
Christopher LaFrieda, Engin Ipek, José F. M...