Cache-oblivious algorithms have the advantage of achieving good sequential cache complexity across all levels of a multi-level cache hierarchy, regardless of the specifics (cache...
Guy E. Blelloch, Phillip B. Gibbons, Harsha Vardha...
Remote Direct Memory Access (RDMA) is a mechanism whereby data is moved directly between the application memory of the local and remote computer. In bypassing the operating system...
Tiling is a crucial loop transformation for generating high performance code on modern architectures. Efficient generation of multilevel tiled code is essential for maximizing da...
Albert Hartono, Muthu Manikandan Baskaran, C&eacut...
—Current implementations of real-time collaborative applications rely on a dedicated infrastructure to carry out all synchronizing and communication functions, and require all en...
Krzysztof Rzadca, Jackson Tan Teck Yong, Anwitaman...
—Analytical models have been used to estimate optimal values for parameters such as tile sizes in the context of loop nests. However, important algorithms such as fast Fourier tr...
Basilio B. Fraguela, Yevgen Voronenko, Markus P&uu...