Abstract--After the seminal work by Taqqu et al. relating selfsimilarity to heavy-tailed distributions, a number of research articles verified that aggregated Internet traffic time...
This paper describes a comprehensive prototype of large-scale fault adaptive embedded software developed for the proposed Fermilab BTeV high energy physics experiment. Lightweight...
Derek Messie, Mina Jung, Jae C. Oh, Shweta Shetty,...
Large-scale systems like BlueGene/L are susceptible to a number of software and hardware failures that can affect system performance. Periodic application checkpointing is a commo...
As the scale is expanding, node failure becomes a commonplace feature of large-scale cluster systems. As an important part of cluster operating system software, job scheduling tak...
Linping Wu, Dan Meng, Jianfeng Zhan, Wang Lei, Bib...
Peer-to-peer computing and networking, an emerging model of communication and computation, has recently started to gain significant acceptance. This model not only enables client...