Most parallel machines, such as clusters, are space-shared in order to isolate batch parallel applications from each other and optimize their performance. However, this leads to lo...
Debugging real systems is hard, requires deep knowledge of the code, and is time-consuming. Bug reports rarely provide sufficient information, thus forcing developers to turn int...
As technology scales and the energy of computation continually approaches thermal equilibrium [1,2], parameter variations and noise levels will lead to larger error rates at vario...
Designing cyber-physical systems with high efficiency, adaptability, autonomy, reliability, and usability is a challenging task. In this paper, we focus on minimizing network-wide ...
Chun Jason Xue, Guoliang Xing, Zhaohui Yuan, Zili ...
We describe a system for aggregating heterogeneous resources from distinct administrative domains into an enterprise-wide compute grid, such that the aggregated resource provides ...