– Efficient implementations of the Discrete Fourier Transform (DFT) for GPUs provide good performance with large data sizes, but are not competitive with CPU code for small data ...
Compiler optimizations are often driven by specific assumptions about the underlying architecture and implementation of the target machine. For example, when targeting shared-mem...
Jack L. Lo, Susan J. Eggers, Henry M. Levy, Sujay ...
We present a context- and path-sensitive algorithm for detecting memory leaks in programs with explicit memory management. Our leak detection algorithm is based on an underlying e...
The Merrimac supercomputer uses stream processors and a highradix network to achieve high performance at low cost and low power. The stream architecture matches the capabilities o...
Mattan Erez, Jung Ho Ahn, Ankit Garg, William J. D...
This paper describes the concept, architecture, development and demonstration of a real time, high performance, software defined 4-receiver system and a space time decoder to be i...