Current integration trends embrace the prosperity of single-chip multi-core processors. Although multi-core processors deliver significantly improved system throughput, single-thr...
VLIW architectures are popular in embedded systems because they offer high-performance processing at low cost and energy. The major problem with traditional VLIW designs is that t...
Hongtao Zhong, Kevin Fan, Scott A. Mahlke, Michael...
Dynamic optimization has the potential to adapt the program’s behavior at run-time to deliver performance improvements over static optimization. Dynamic optimization systems usu...
While general-purpose processors have only recently employed chip multiprocessor (CMP) architectures, network processors (NPs) have used heterogeneous multi-core architectures sin...
Coherence misses in shared-memory multiprocessors account for a substantial fraction of execution time in many important scientific and commercial workloads. Memory streaming prov...
Thomas F. Wenisch, Stephen Somogyi, Nikolaos Harda...
We describe the design, generation and compression of the extended whole program path (eWPP) representation that not only captures the control flow history of a program execution...
Helper threading is a technique that utilizes a second core or logical processor in a multi-threaded system to improve the performance of the main thread. A helper thread executes...
We propose a checkpoint store compression method for coarse-grain giga-scale checkpoint/restore. This mechanism can be useful for debugging, post-mortem analysis and error recover...