Sciweavers

» Systems Failures
JPDC
2006
13 years 9 months ago
CEFT: A cost-effective, fault-tolerant parallel virtual file system
The vulnerability of computer nodes due to component failures is a critical issue for cluster-based file systems. This paper studies the development and deployment of mirroring in...
Yifeng Zhu, Hong Jiang
ISPDC
2003
IEEE
14 years 2 months ago
Lightweight Logging and Recovery for Distributed Shared Memory over Virtual Interface Architecture
As software Distributed Shared Memory (DSM) systems become attractive on larger clusters, the focus of attention moves toward improving the reliability of systems. In this paper, w...
Soyeon Park, Youngjae Kim, Seung Ryoul Maeng
WORDS
2005
IEEE
14 years 3 months ago
Towards Self-Healing Systems via Dependable Architecture and
Self-healing systems focus on how to reduce the complexity and cost of the management of dependability policies and mechanisms without human intervention. This position paper pr...
Hong Mei, Gang Huang, Wei-Tek Tsai
OSDI
2008
ACM
14 years 10 months ago
CuriOS: Improving Reliability through Operating System Structure
An error that occurs in a microkernel operating system service can potentially result in state corruption and service failure. A simple restart of the failed service is not always...
Francis M. David, Ellick Chan, Jeffrey C. Carlyle,...
DSN
2009
IEEE
14 years 1 months ago
Evaluating the impact of Undetected Disk Errors in RAID systems
Despite the reliability of modern disks, recent studies have made it clear that a new class of faults, Undetected Disk Errors (UDEs), also known as silent data corruption events, b...
Eric Rozier, Wendy Belluomini, Veera Deenadhayalan...