Future high-performance billion-transistor processors are likely to employ partitioned architectures to achieve high clock speeds, high parallelism, low design complexity, and low...
This paper presents results on a new approach to partitioning a modulo-scheduled loop for distributed execution on parallel clusters of functional units organized as a VLIW machin...
Marcio Merino Fernandes, Josep Llosa, Nigel P. Top...
Given the multicore microprocessor revolution, we argue that the architecture research community needs a dramatic increase in simulation capacity. We believe FPGA Architecture Mod...
Zhangxi Tan, Andrew Waterman, Henry Cook, Sarah Bi...
With the ever-increasing transistor variability in CMOS technology, it is essential to integrate variation-aware performance analysis into the task allocation and scheduling proce...
This paper presents the study of running several core multimedia applications on a simultaneous multithreading (SMT) architecture and derives design principles for multimedia soft...
Yen-Kuang Chen, Rainer Lienhart, Eric Debes, Matth...