In this paper we compare the performance of area equivalent small, medium, and large-scale multithreaded chip multiprocessors (CMTs) using throughput-oriented applications. We use...
Numerous tools have been proposed to help developers fix software errors and inefficiencies. Widely-used techniques such as memory checking suffer from overheads that limit thei...
Joseph L. Greathouse, Hongyi Xin, Yixin Luo, Todd ...
Advanced Synchronization Facility (ASF) is an AMD64 hardware extension for lock-free data structures and transactional memory. It provides a speculative region that atomically exec...
Jae-Woong Chung, Luke Yen, Stephan Diestelhorst, M...
Instruction fetch behavior has been shown to be very regular and predictable, even for diverse application areas. In this work, we propose the Lookahead Instruction Fetch Engine (...
Stephen Roderick Hines, Yuval Peress, Peter Gavin,...
As we reach the limits of single-core computing, we are promised more and more cores in our systems. Modern architectures include many performance counters per core, but few or no...
Paul E. West, Yuval Peress, Gary S. Tyson, Sally A...