The design of scalable and reliable interconnection networks for multicore chips (NoCs) introduces new design constraints like power consumption, area, and ultra-low latencies. ...
Indirect jumps are used to implement increasingly common programming language constructs such as virtual function calls, switch-case statements, jump tables, and interfac...
Current on-chip block-centric memory hierarchies exploit access patterns at the fine-grain scale of small blocks. Several recently proposed memory hierarchy enhancements for coher...
Some processors designed for consumer applications, such as Graphics Processing Units (GPUs) and the CELL processor, promise outstanding floating-point performance for ...
Several recently proposed techniques including CPR (Checkpoint Processing and Recovery) and NoSQ (No Store Queue) rely on reference counting to manage physical registers. Howeve...
This paper studies the important interaction between parallelization and energy consumption in a parallelizable application. Given the ratio of serial and parallel portions in an...
As transistor feature sizes continue to shrink into the sub-90nm range and beyond, the effects of process variations on critical path delay and chip yields have amplified. A com...
This paper evaluates asymmetric cluster chip multiprocessor (ACCMP) architectures as a mechanism to achieve the highest performance for a given power budget. ACCMPs execute serial ...
T. Y. Morad, Uri C. Weiser, A. Kolodny, Mateo Val...
Cycle-accurate simulation has long been the primary tool for micro-architecture design and evaluation. Though accurate, the slow speed often imposes constraints on the e...