Sciweavers

392 search results for "Fault Tolerance in a DSM Cluster Operating System"
SRDS
1998
IEEE
14 years 27 days ago
Optimization of a Real-Time Primary-Backup Replication Service
The primary-backup replication model is one of the commonly adopted approaches to providing fault tolerant data services. Its extension to the real-time environment, however, impo...
Hengming Zou, Farnam Jahanian
SOSP
1997
ACM
13 years 10 months ago
Distributed Schedule Management in the Tiger Video Fileserver
Tiger is a scalable, fault-tolerant video file server constructed from a collection of computers connected by a switched network. All content files are striped across all of the c...
William J. Bolosky, Robert P. Fitzgerald, John R. ...
ICPP
2007
IEEE
14 years 2 months ago
A Meta-Learning Failure Predictor for Blue Gene/L Systems
The demand for more computational power in science and engineering has spurred the design and deployment of ever-growing cluster systems. Even though the individual components use...
Prashasta Gujrati, Yawei Li, Zhiling Lan, Rajeev T...
HCW
1998
IEEE
14 years 27 days ago
CCS Resource Management in Networked HPC Systems
CCS is a resource management system for parallel high-performance computers. At the user level, CCS provides vendor-independent access to parallel systems. At the system administr...
Axel Keller, Alexander Reinefeld
ICECCS
2010
IEEE
13 years 8 months ago
Towards Self-Healing Swarm Robotic Systems Inspired by Granuloma Formation
Granuloma is a medical term for a ball-like collection of immune cells that attempts to remove foreign substances from a host organism. This response is a special type o...
Amelia Ritahani Ismail, Jon Timmis