Sciweavers

» Automatic recovery from software failure
SIGSOFT
2007
ACM
14 years 8 months ago
Fault and adversary tolerance as an emergent property of distributed systems' software architectures
Fault and adversary tolerance have become not only desirable but required properties of software systems because mission-critical systems are commonly distributed on large network...
Yuriy Brun, Nenad Medvidovic
PVM
2010
Springer
13 years 6 months ago
Dodging the Cost of Unavoidable Memory Copies in Message Logging Protocols
With the number of computing elements spiraling to hundreds of thousands in modern HPC systems, failures are common events. Few applications are nevertheless fault toleran...
George Bosilca, Aurelien Bouteiller, Thomas Hé...
ICSE
2009
IEEE-ACM
14 years 8 months ago
A toolset for automated failure analysis
Classic fault localization techniques can automatically provide information about the suspicious code blocks that are likely responsible for observed failures. This information is...
Fabrizio Pastore, Leonardo Mariani, Mauro Pezzè...
COMPSAC
2008
IEEE
13 years 9 months ago
Avoiding Program Failures Through Safe Execution Perturbations
We present an online framework to capture and recover from program failures and prevent them from occurring in the future through safe execution perturbations. The perturbations a...
Sriraman Tallam, Chen Tian, Rajiv Gupta, Xiangyu Z...
ICST
2009
IEEE
14 years 2 months ago
On the Effectiveness of Test Extraction without Overhead
Developers write and execute ad-hoc tests as they implement software. While these tests reflect important insights of the developers (e.g., which parts of the software need testi...
Andreas Leitner, Alexander Pretschner, Stefan Mori...