There has recently been increasing interest in using system virtualization to improve the dependability of HPC cluster systems. However, it is not cost-free and may come with som...
Haibo Chen, Rong Chen, Fengzhe Zhang, Binyu Zang, ...
In this demonstration we present BRRL, a library for making distributed main-memory applications fault tolerant. BRRL is optimized for cloud applications with frequent points of c...
Tuan Cao, Benjamin Sowell, Marcos Antonio Vaz Sall...
Under sponsorship of the Defense Advanced Research Projects Agency’s (DARPA) Fault Tolerant Networks (FTN) program, The Johns Hopkins University Applied Physics Laboratory (JHU/...
W. J. Blackert, D. M. Gregg, A. K. Castner, E. M. ...
Reliable storage of data with concurrent read/write accesses (or query/update) is an ever-recurring issue in distributed settings. In mobile ad hoc networks, the problem becomes e...
The Network Weather Service (NWS) is a distributed resource monitoring and utilization prediction system, employed as an aid to scheduling jobs in a metacomputing environment 9, 1...
Robert E. Busby Jr., Mitchell L. Neilsen, Daniel A...