Soft-error induced reliability problems have become a major challenge in designing new generation microprocessors. Due to the on-chip caches' dominant share in die area and transistor budget, protecting them against soft errors is of paramount importance. Recent research has focused on the design of cost-effective reliable data caches in terms of performance, energy, and area overheads, based on the assumption of fixed error rates. However, for systems in operating environments that vary with time or location, those schemes will be either insufficient or overdesigned for the changing error rates. In this paper, we explore the design of a self-adaptive reliable data cache that dynamically adapts its employed reliability schemes to the changing operating environments thus to maintain a target reliability. The proposed data cache is implemented with three levels of error protection schemes, a monitoring mechanism, and a control component that decides whether to upgrade, downgrade, or...
Shuai Wang, Jie S. Hu, Sotirios G. Ziavras