—Many application-specific architectures provide indirect addressing modes with auto-increment/decrement arithmetic. Since these architectures generally do not feature an indexe...
This paper presents recursion unrolling, a technique for improving the performance of recursive computations. Conceptually, recursion unrolling inlines recursive calls to reduce c...
Abstract—Load elimination is a classical compiler transformation that is increasing in importance for multi-core and many-core architectures. The effect of the transformation is ...
Complex shaders must be partitioned into multiple passes to execute on GPUs with limited hardware resources. Automatic partitioning gives rise to an NP-hard scheduling problem tha...
Register allocation is often a two-phase approach: spilling of registers to memory, followed by coalescing of registers. Extreme liverange splitting (i.e. live-range splitting aft...