Data prefetching technique is widely used to bridge the growing performance gap between processor and memory. Numerous prefetching techniques have been proposed to exploit data pa...
A common approach for dealing with large data sets is to stream over the input in one pass, and perform computations using sublinear resources. For truly massive data sets, howeve...
Jon Feldman, S. Muthukrishnan, Anastasios Sidiropo...
SIMD (Single Instruction, Multiple Data) engines are an essential part of the processors in various computing markets, from servers to the embedded domain. Although SIMD-enabled a...
Amir Hormati, Yoonseo Choi, Mark Woh, Manjunath Ku...
As more complex DSP algorithms are realized in practice, an increasing need for high-level stream abstractions that can be compiled without sacrificing efficiency. Toward this en...
Andrew A. Lamb, William Thies, Saman P. Amarasingh...
We present a case study parallelizing streaming aggregation on three different parallel hardware architectures. Aggregation is a performance-critical operation for data summarizat...
Scott Schneider, Henrique Andrade, Bugra Gedik, Ku...