Large scale compute clusters continue to grow to ever-increasing proportions. However, as clusters and applications continue to grow, the Mean Time Between Failures (MTBF) has redu...
As device scales shrink, higher transistor counts are available while soft-errors, even in logic, become a major concern. A new class of architectures, such as Merrimac and the IB...
Mattan Erez, Nuwan Jayasena, Timothy J. Knight, Wi...
Commercial designs are integrating from 10 to 100 embedded functional and storage blocks in a single system on chip (SoC) currently, and the number is likely to increase significa...
The importance of fault tolerance at the processor architecture level has been made increasingly important due to rapid advancements in the design and usage of high performance de...
T. S. Ganesh, Viswanathan Subramanian, Arun K. Som...
The Simultaneous Optical Multiprocessor Exchange Bus (SOME-Bus) is a low-latency, high-bandwidth interconnection network which directly links arbitrary pairs of processor nodes wit...