This paper presents a framework for analyzing the performance of multithreaded programs using a model called a constraint graph. We review previous constraint graph definitions fo...
Sequential consistency (SC) is the simplest programming interface for shared-memory systems but imposes program order among all memory operations, possibly precluding high perform...
As the performance gap between the CPU and main memory continues to grow, techniques to hide memory latency are essential to deliver a high performance computer system. Prefetchin...
New fusion memory devices consisting of multiple heterogeneous memory components in a single die or package offer efficient ways to optimize embedded systems in terms of energy, pe...
Yongsoo Joo, Yongseok Choi, Jaehyun Park, Chanik P...
Branches that depend directly or indirectly on load instructions are a leading cause of mispredictions by state-of-the-art branch predictors. For a branch of this type, there is a...