The design of large dependable multiprocessor systems requires quick and precise mechanisms for detecting the faulty nodes. The problem of system-level fault diagnosis is computati...
Supermon is a flexible set of tools for high speed, scalable cluster monitoring. Node behavior can be monitored much faster than with other commonly used methods (e.g., rstatd). ...
In this paper we propose the OPTNET, a novel optical network and associated coherence protocol for scalable multiprocessors. The network divides its channels into broadcast and po...
This paper deals with multiprocessor systems required to provide both high performance and good figures of dependability attributes. Fault tolerance is pursued through a proper co...
Felicita Di Giandomenico, Silvano Chiaradonna, And...
We introduce a fast and accurate technique for initializing the directory and cache state of a multiprocessor system based on a novel software structure called the memory timestam...
Kenneth C. Barr, Heidi Pan, Michael Zhang, Krste A...