Sciweavers

482 search results - page 43 / 97
» A large-scale study of failures in high-performance computin...
DCOSS
2006
Springer
15 years 7 months ago
When Birds Die: Making Population Protocols Fault-Tolerant
In the population protocol model introduced by Angluin et al. [2], a collection of agents, which are modelled by finite state machines, move around unpredictably and have pairwise ...
Carole Delporte-Gallet, Hugues Fauconnier, Rachid ...
ICAC
2005
IEEE
15 years 9 months ago
A Mass Storage System Administrator Autonomic Assistant
System administrators of today’s high performance computing systems are generally responsible for managing the large amounts of data traffic and archival querying that mass stor...
Milton Halem, Randy Schauer
SEW
2005
IEEE
15 years 9 months ago
Using Visualization to Understand Dependability: A Tool Support for Requirements Analysis
Dealing with dependability requirements is a complex task for stakeholders and analysts as many different aspects of a system must be taken into account at the same time: services...
Paolo Donzelli, Daniel Hirschbach, Victor R. Basil...
HPDC
1999
IEEE
15 years 7 months ago
The Cactus Computational Toolkit and using Distributed Computing to Collide Neutron Stars
We are developing a system for collaborative research and development for a distributed group of researchers at different institutions around the world. In a new paradigm for coll...
Gabrielle Allen, Tom Goodale, Joan Massó, E...
SRDS
2006
IEEE
15 years 9 months ago
Topology Sensitive Replica Selection
As the disks typically found in personal computers grow larger, protecting data by replicating it on a collection of “peer” systems rather than on dedicated high performance s...
Dmitry Brodsky, Michael J. Feeley, Norman C. Hutch...