Heterogeneous multicores, such as Cell BE processors and GPGPUs, typically do not have caches for their accelerator cores because coherence traffic, cache misses, and latencies fr...
We present an auto-tuning approach to optimize application performance on emerging multicore architectures. The methodology extends the idea of searchbased performance optimizatio...
Samuel Williams, Jonathan Carter, Leonid Oliker, J...
This work is based on our philosophy of providing interlayer system-level power awareness in computing systems [26, 27]. Here, we couple this approach with our vision of multipart...
Osman S. Unsal, Israel Koren, C. Mani Krishna, Csa...
Recently, multi-core architectures with alternative memory subsystem designs have emerged. Instead of using hardwaremanaged cache hierarchies, they employ software-managed embedde...
In this paper, we describe a compilation system that automates much of the process of performance tuning that is currently done manually by application programmers interested in h...
Nastaran Baradaran, Jacqueline Chame, Chun Chen, P...