Sciweavers

» Computing in the Presence of Timing Failures
SRDS
2007
IEEE
14 years 3 months ago
Using Hidden Semi-Markov Models for Effective Online Failure Prediction
A proactive handling of faults requires that the risk of upcoming failures is continuously assessed. One of the promising approaches is online failure prediction, which means that...
Felix Salfner, Miroslaw Malek
SIGMETRICS
1998
ACM
13 years 8 months ago
Internet service performance failure detection
The increasing complexity of computer networks and our increasing dependence on them mean that enforcing reliability requirements is both more challenging and more critical. The expan...
Amy R. Ward, Peter W. Glynn, Kathy J. Richardson
JSSPP
2004
Springer
14 years 2 months ago
Performance Implications of Failures in Large-Scale Cluster Scheduling
As we continue to evolve into large-scale parallel systems, many of them employing hundreds of computing engines to take on mission-critical roles, it is crucial to design those s...
Yanyong Zhang, Mark S. Squillante, Anand Sivasubra...
SBACPAD
2005
IEEE
14 years 2 months ago
VRM: A Failure-Aware Grid Resource Management System
For resource management in Grid environments, advance reservations have turned out to be very useful and hence are supported by a variety of Grid toolkits. However, failure ...
Lars-Olof Burchard, César A. F. De Rose, Ha...
PODC
2009
ACM
14 years 4 months ago
Fast scalable deterministic consensus for crash failures
We study communication complexity of consensus in synchronous message-passing systems with processes prone to crashes. The goal in the consensus problem is to have all the nonfaul...
Bogdan S. Chlebus, Dariusz R. Kowalski, Michal Str...