The memory consistency model in parallel programming controls the order in which operations performed by one thread may be observed by another. Language designers have been reluct...
Register allocation in loops is generally performed after or during the software pipelining process. This is because doing a conventional register allocation at firs...
Stream processing represents an important class of applications that spans telecommunications, multimedia and the Internet. The implementation of streaming programs in FPGAs has a...
Andrei Hagiescu, Weng-Fai Wong, David F. Bacon, Ro...
We present a scalable, high-performance solution to multidimensional recurrences that arise in adaptive statistical designs. Adaptive designs are an important class of learning al...
Robert H. Oehmke, Janis Hardwick, Quentin F. Stout
We demonstrate Spiral, a domain-specific library generation system. Spiral generates high-performance source code for linear transforms (such as the discrete Fourier transform and ...