Concurrency control is one of the main sources of error and complexity in shared memory parallel programming. While there are several techniques to handle concurrency control such...
Luis Ceze, Christoph von Praun, Calin Cascaval, Pa...
We develop an availability solution, called SafetyNet, that uses a unified, lightweight checkpoint/recovery mechanism to support multiple long-latency fault detection schemes. At...
Daniel J. Sorin, Milo M. K. Martin, Mark D. Hill, ...
With the current trend in integration of more complex systems on chip there is a need for better communication infrastructure on chip that will increase the available bandwidth an...
The high degree of complexity and autonomy of future robotic space missions, such as Mars Science Laboratory (MSL), poses serious challenges in assuring their reliability and ef...
It is well known that in a typical real-time system, certain parameters, such as the execution time of a job, are not fixed numbers. In such systems, it is common to characterize ...