Search Sciweavers | Sciweavers

» Fault Tolerance in a DSM Cluster Operating System

DAC
2011
ACM

DRAIN: distributed recovery architecture for inaccessible nodes in multi-core chips

As transistor dimensions continue to scale deep into the nanometer regime, silicon reliability is becoming a chief concern. At the same time, transistor counts are scaling up, ena...

Andrew DeOrio, Konstantinos Aisopos, Valeria Berta...

MIDDLEWARE
2009
Springer

Why Do Upgrades Fail and What Can We Do about It?

Enterprise-system upgrades are unreliable and often produce downtime or data loss. Errors in the upgrade procedure, such as broken dependencies, constitute the leading ca...

Tudor Dumitras, Priya Narasimhan

ISORC
2003
IEEE

A Dynamic Shadow Approach for Mobile Agents to Survive Crash Failures

Fault tolerance schemes for mobile agents to survive agent server crash failures are complex since developers normally have no control over remote agent servers. Some solutions mo...

Simon Pears, Jie Xu, Cornelia Boldyreff

ISCA
2010
IEEE

Relax: an architectural framework for software recovery of hardware faults

As technology scales ever further, device unreliability is creating excessive complexity for hardware to maintain the illusion of perfect operation. In this paper, we consider whe...

Marc de Kruijf, Shuou Nomura, Karthikeyan Sankaral...

SIGMETRICS
2008
ACM

Disk scrubbing versus intra-disk redundancy for high-reliability RAID storage systems

Two schemes proposed to cope with unrecoverable or latent media errors and enhance the reliability of RAID systems are examined. The first scheme is the established, widely used d...

Ilias Iliadis, Robert Haas, Xiao-Yu Hu, Evangelos ...
