Sciweavers

7271 search results - page 114 / 1455
» Fault-Tolerant Distributed Simulation
HCW
1998
IEEE
14 years 1 month ago
CCS Resource Management in Networked HPC Systems
CCS is a resource management system for parallel high-performance computers. At the user level, CCS provides vendor-independent access to parallel systems. At the system administr...
Axel Keller, Alexander Reinefeld
IPPS
2008
IEEE
14 years 3 months ago
VT-ASOS: Holistic system software customization for many cores
VT-ASOS is a framework for holistic and continuous customization of system software on HPC systems. The framework leverages paravirtualization technology. VT-ASOS extends the Xen ...
Dimitrios S. Nikolopoulos, Godmar Back, Jyotirmaya...
ESCIENCE
2006
IEEE
14 years 3 months ago
A Unified Data Grid Replication Framework
Modern scientific experiments can generate large amounts of data, which may be replicated and distributed across multiple resources to improve application performance and fault to...
Tim Ho, David Abramson
HPDC
2006
IEEE
14 years 3 months ago
Toward Self Organizing Grids
The potential of truly large scale grids can only be realized with grid architectures and deployment strategies that lower the need for human administrative intervention, and t...
Nael B. Abu-Ghazaleh, Michael J. Lewis
IPPS
2006
IEEE
14 years 3 months ago
An advanced performance analysis of self-stabilizing protocols: stabilization time with transient faults during convergence
A self-stabilizing protocol is a brilliant framework for fault tolerance. It can recover from any number and any type of transient faults and eventually converge to its intended b...
Yoshihiro Nakaminami, Hirotsugu Kakugawa, Toshimit...