Programmers and users of compute-intensive scientific applications often do not want to (or even cannot) code load balancing and fault tolerance into their programs. The PBEAM syst...
This paper describes a study performed in an industrial setting that attempts to build predictive models to identify parts of a Java system with a high probability of fault. The s...
Proactive handling of faults requires that the risk of upcoming failures be continuously assessed. One promising approach is online failure prediction, which means that...
A game-theoretic model for studying power control in multicarrier code-division multiple-access systems is proposed. Power control is modeled as a noncooperative game in which each...
Farhad Meshkati, Mung Chiang, H. Vincent Poor, Stu...
This paper presents a method to optimize the timeout value of computing jobs. It relies on a model of the job execution time that considers the job management system latency throu...