We describe Neutron, a version of the TinyOS operating system that efficiently recovers from memory safety bugs. Where existing schemes reboot an entire node on an error, Neutron...
Yang Chen, Omprakash Gnawali, Maria A. Kazandjieva...
Faulty device drivers cause significant damage through down time and data loss. The problem can be mitigated by an improved driver development process that guarantees correctness...
Leonid Ryzhyk, Peter Chubb, Ihor Kuz, Etienne Le S...
We revisit the problem of scaling software routers, motivated by recent advances in server technology that enable highspeed parallel processing—a feature router workloads appear...
Mihai Dobrescu, Norbert Egi, Katerina J. Argyraki,...
Reproducing bugs is hard. Deterministic replay systems address this problem by providing a high-fidelity replica of an original program run that can be repeatedly executed to zer...
Helios is an operating system designed to simplify the task of writing, deploying, and tuning applications for heterogeneous platforms. Helios introduces satellite kernels, which ...
Edmund B. Nightingale, Orion Hodson, Ross McIlroy,...
We present ClearView, a system for automatically patching errors in deployed software. ClearView works on stripped Windows x86 binaries without any need for source code, debugging...
Jeff H. Perkins, Sunghun Kim, Samuel Larsen, Saman...
The UpRight library seeks to make Byzantine fault tolerance (BFT) a simple and viable alternative to crash fault tolerance for a range of cluster services. We demonstrate UpRight ...
Allen Clement, Manos Kapritsos, Sangmin Lee, Yang ...
Bug reproduction is critically important for diagnosing a production-run failure. Unfortunately, reproducing a concurrency bug on multi-processors (e.g., multi-core) is challengin...
Soyeon Park, Yuanyuan Zhou, Weiwei Xiong, Zuoning ...
RESIN is a new language runtime that helps prevent security vulnerabilities, by allowing programmers to specify application-level data flow assertions. RESIN provides policy obje...
Alexander Yip, Xi Wang, Nickolai Zeldovich, M. Fra...
Windows Error Reporting (WER) is a distributed system that automates the processing of error reports coming from an installed base of a billion machines. WER has collected billion...
Kirk Glerum, Kinshuman Kinshumann, Steve Greenberg...