This paper describes an experimental study of Linux kernel behavior in the presence of errors that impact the instruction stream of the kernel code. Extensive error injection exper...
Weining Gu, Zbigniew Kalbarczyk, Ravishankar K. Iy...
A Resilient Overlay Network (RON) is an architecture that allows distributed Internet applications to detect and recover from path outages and periods of degraded performance with...
David G. Andersen, Hari Balakrishnan, M. Frans Kaa...
This paper discusses a formal and rigorous approach to the analysis of operator interaction with machines. It addresses the acute problem of detecting design errors in human-machi...
Streamlining communication is key to achieving good performance in shared-memory parallel programs. While full hardware support for cache coherence generally offers the best perfo...
Some software defects trigger failures only when certain complex information flows occur within the software. Profiling and analyzing such flows therefore provides a potentially i...