In the embedded domain, custom hardware in the form of ASICs is often used to implement critical parts of applications when performance and energy efficiency goals cannot be met ...
Kevin Fan, Hyunchul Park, Manjunath Kudlur, Scott ...
This paper describes an algorithm that takes a trace (i.e., a sequence of numbers or vectors of numbers) as input, and from that produces a sequence of loop nests that, when run, ...
Array inlining expands the concepts of object inlining to arrays. Groups of objects and arrays that reference each other are placed consecutively in memory so that their relative ...
The recent trend in the processor industry of packing multiple processor cores in a chip has increased the importance of automatic techniques for extracting thread level paralleli...
Easwaran Raman, Neil Vachharajani, Ram Rangan, Dav...
Modern compilers implement a large number of optimizations which all interact in complex ways, and which all have a different impact on code quality, compilation time, code size,...
Vector-thread (VT) architectures exploit multiple forms of parallelism simultaneously. This paper describes a compiler for the Scale VT architecture, which takes advantage of the ...