Sciweavers

114 search results - page 9 / 23
CCGRID
2006
IEEE
14 years 1 month ago
Exploit Failure Prediction for Adaptive Fault-Tolerance in Cluster Computing
As the scale of cluster computing grows, it is becoming hard for long-running applications to complete without facing failures on large-scale clusters. To address this issue, chec...
Yawei Li, Zhiling Lan
ICDCS
1996
IEEE
13 years 11 months ago
How to Recover Efficiently and Asynchronously when Optimism Fails
We propose a new algorithm for recovering asynchronously from failures in a distributed computation. Our algorithm is based on two novel concepts - a fault-tolerant vector clock t...
Om P. Damani, Vijay K. Garg
ICDCS
2008
IEEE
14 years 2 months ago
Toward Predictive Failure Management for Distributed Stream Processing Systems
Distributed stream processing systems (DSPSs) have many important applications such as sensor data analysis, network security, and business intelligence. Failure management is ess...
Xiaohui Gu, Spiros Papadimitriou, Philip S. Yu, Sh...
IPPS
2010
IEEE
13 years 5 months ago
Scalable failure recovery for high-performance data aggregation
Many high-performance tools, applications and infrastructures, such as Paradyn, STAT, TAU, Ganglia, SuperMon, Astrolabe, Borealis, and MRNet, use data aggregation to synthesize lar...
Dorian C. Arnold, Barton P. Miller
ICCD
2006
IEEE
14 years 4 months ago
Pesticide: Using SMT Processors to Improve Performance of Pointer Bug Detection
Pointer bugs associated with dynamically-allocated objects resulting in out-of-bounds memory access are an important class of software bugs. Because such bugs cannot be detected e...
Jin-Yi Wang, Yen-Shiang Shue, T. N. Vijaykumar, Sa...