Sciweavers

799 search results - page 69 / 160
» On Failures and Faults
SIGMOD
2004
ACM
14 years 9 months ago
Highly-Available, Fault-Tolerant, Parallel Dataflows
We present a technique that masks failures in a cluster to provide high availability and fault-tolerance for long-running, parallelized dataflows. We can use these dataflows to im...
Mehul A. Shah, Joseph M. Hellerstein, Eric A. Brew...
PODC
2012
ACM
11 years 11 months ago
The cost of fault tolerance in multi-party communication complexity
Multi-party communication complexity involves distributed computation of a function over inputs held by multiple distributed players. A key focus of distributed computing research...
Binbin Chen, Haifeng Yu, Yuda Zhao, Phillip B. Gib...
MICRO
2009
IEEE
14 years 3 months ago
mSWAT: low-cost hardware fault detection and diagnosis for multicore systems
Continued technology scaling is resulting in systems with billions of devices. Unfortunately, these devices are prone to failures from various sources, resulting in even commodity...
Siva Kumar Sastry Hari, Man-Lap Li, Pradeep Ramach...
DATE
2007
IEEE
14 years 3 months ago
Using an innovative SoC-level FMEA methodology to design in compliance with IEC61508
This paper proposes an innovative methodology to perform and validate a Failure Mode and Effects Analysis (FMEA) at System-on-Chip (SoC) level. This is done in compliance with the...
Riccardo Mariani, Gabriele Boschi, Federico Colucc...
HASE
1996
IEEE
14 years 28 days ago
Adaptive recovery for mobile environments
Mobile computing allows ubiquitous and continuous access to computing resources while the users travel or work at a client's site. The flexibility introduced by mobile computi...
Nuno Neves, W. Kent Fuchs