Middleware for parallel and distributed systems is designed to virtualize computation and communication resources so that a more consistent view of those resources is presente...
High-accuracy PDE solvers use multi-dimensional fast Fourier transforms. The FFTs exhibit a static and structured memory access pattern, which results in a large amount of communic...
Estimating the maximum power and thermal characteristics of a processor is essential for designing its power delivery system, packaging, cooling, and power/thermal management sche...
Ajay M. Joshi, Lieven Eeckhout, Lizy Kurian John, ...
Hiding communication latency is an important optimization for parallel programs. Programmers or compilers achieve this by using non-blocking communication primitives and overlappi...
The speedup over a microprocessor that can be achieved by implementing some programs on an FPGA has been extensively reported. This paper presents an analysis, both quantitative a...
Zhi Guo, Walid A. Najjar, Frank Vahid, Kees A. Vis...