Sciweavers

647 search results - page 101 / 130
» Simulating Failures on Large-Scale Systems
SIGSOFT
2010
ACM
13 years 5 months ago
Finding latent performance bugs in systems implementations
Robust distributed systems commonly employ high-level recovery mechanisms enabling the system to recover from a wide variety of problematic environmental conditions such as node f...
Charles Edwin Killian, Karthik Nagaraj, Salman Per...
RTAS
2010
IEEE
13 years 6 months ago
Feedback Thermal Control for Real-time Systems
Thermal control is crucial to real-time systems as excessive processor temperature can cause system failure or unacceptable performance degradation due to hardware throttling. R...
Yong Fu, Nicholas Kottenstette, Yingming Chen, Che...
DEDS
2002
13 years 7 months ago
Diagnosing Discrete-Event Systems: Extending the "Diagnoser Approach" to Deal with Telecommunication Networks
Detection and isolation of failures in large and complex systems such as telecommunication networks are crucial and challenging tasks. The problem considered here is that...
Laurence Rozé, Marie-Odile Cordier
MSS
2007
IEEE
14 years 2 months ago
Quota enforcement for high-performance distributed storage systems
Storage systems manage quotas to ensure that no one user can use more than their share of storage, and that each user gets the storage they need. This is difficult for large, dis...
Kristal T. Pollack, Darrell D. E. Long, Richard A....
HASE
1998
IEEE
14 years 1 days ago
Combining Various Solution Techniques for Dynamic Fault Tree Analysis of Computer Systems
Fault trees provide a graphical and logical framework for analyzing the reliability of systems. A fault tree provides a conceptually simple modeling framework to represent the sys...
Ragavan Manian, Joanne Bechta Dugan, David Coppit,...