Abstract. Loop fusion is a program transformation that merges multiple loops into one. It is e ective for reducing the synchronization overhead of parallel loops and for improving ...
Abstract. Inline-threaded interpretation is a recent technique that improves performance by eliminating dispatch overhead within basic blocks for interpreters written in C [11]. Th...
We propose signature-accelerated transactional memory (SigTM), a hybrid TM system that reduces the overhead of software transactions. SigTM uses hardware signatures to track the r...
Chi Cao Minh, Martin Trautmann, JaeWoong Chung, Au...
DSP processors usually provide dedicated address generation units (AGUs) to assist address computation. By carefully allocating variables in the memory, DSP compilers take advanta...
We propose a new instruction, branch-on-random, that is like a standard conditional branch, except rather than specifying the condition on which the branch should be taken, it spe...