Previous research has shown that Staged Execution (SE), i.e., dividing a program into segments and executing each segment at the core that has the data and/or functionality to bes...
The design and evaluation of microprocessor architectures is a difficult and time-consuming task. Although small, handcoded microbenchmarks can be used to accelerate performance e...
Increasing demand for performance and efficiency has driven the computer industry toward multicore systems. These systems have become the industry standard in almost all segments...
Amir Hormati, Yoonseo Choi, Manjunath Kudlur, Rodr...
This paper presents a novel cycle-approximate performance estimation technique for automatically generated transaction level models (TLMs) for heterogeneous multicore designs. The...
Graphics processors (GPUs) provide a vast number of simple, data-parallel, deeply multithreaded cores and high memory bandwidths. GPU architectures are becoming increasingly progr...
Shuai Che, Michael Boyer, Jiayuan Meng, David Tarj...