Sciweavers

» Simulating Failures on Large-Scale Systems
SIGCOMM
2004
ACM
14 years 1 month ago
A scalable distributed information management system
We present a Scalable Distributed Information Management System (SDIMS) that aggregates information about large-scale networked systems and that can serve as a basic building bloc...
Praveen Yalagandula, Michael Dahlin
ISCA
2005
IEEE
14 years 1 month ago
Design and Evaluation of Hybrid Fault-Detection Systems
As chip densities and clock rates increase, processors are becoming more susceptible to transient faults that can affect program correctness. Up to now, system designers have prim...
George A. Reis, Jonathan Chang, Neil Vachharajani,...
MR
2007
13 years 7 months ago
A maintenance planning and business case development model for the application of prognostics and health management (PHM) to ele
This paper presents a model that enables the optimal interpretation of Prognostics and Health Management (PHM) results for electronic systems. In this context, optimal interpreta...
Peter A. Sandborn, Chris Wilkinson
IPPS
2006
IEEE
14 years 1 month ago
Towards building a highly-available cluster based model for high performance computing
In recent years, we have witnessed a growing interest in high performance computing (HPC) using a cluster of workstations. However, many challenges remain to be resolved before th...
Azzedine Boukerche, Raed Al-Shaikh, Mirela Sechi M...
CDC
2010
IEEE
13 years 2 months ago
Performance-oriented communication topology design for large-scale interconnected systems
Communication networks provide a larger flexibility with respect to the control design of large-scale interconnected systems by allowing the information exchange between...
Azwirman Gusrialdi, Sandra Hirche