Due to the shrinking of feature size and reduction in supply voltages, nanoscale circuits have become more susceptible to radiation induced transient faults. In this paper, we pre...
—With increasing numbers of cores, future CMPs (Chip Multi-Processors) are likely to have a tiled architecture with a portion of shared L2 cache on each tile and a bankinterleave...
Abstract. One of the most important collective communication patterns used in scientific applications is the complete exchange, also called All-to-All. Although efficient algorithm...
Luiz Angelo Steffenel, Maxime Martinasso, Denis Tr...
SARC merges cache controller and network interface functions by relying on a single hardware primitive: each access checks the tag and the state of the addressed line for possible...
Traditional performance models are too brittle to be relied on for continuous capacity planning and performance debugging in many computer systems. Simply put, a brittle model is ...