Speculative pre-execution achieves efficient data prefetching by running additional prefetching threads on spare hardware contexts. Various implementations for speculative pre-exe...
Abstract. We describe the structure of a compilation system that generates code for processor architectures supporting both explicit and implicit parallel threads. Such architectur...
Traditional vector architectures have shown to be very effective for regular codes where the compiler can detect data-level parallelism. However, this SIMD parallelism is also pre...
The advent of multicores presents a promising opportunity for exploiting fine grained parallelism present in programs. Programs parallelized in the above fashion, typically involv...
Automatic parallelization is a promising strategy to improve application performance in the multicore era. However, common programming practices such as the reuse of data structur...
Nick P. Johnson, Hanjun Kim, Prakash Prabhu, Ayal ...