This paper presents specialized code generation techniques and runtime optimizations for developing light-weight XML Web services for embedded devices. The optimizations are imple...
Compiler technology for multimedia extensions must effectively utilize not only the SIMD compute engines but also the various levels of the memory hierarchy: superword registers,...
Chun Chen, Jaewook Shin, Shiva Kintali, Jacqueline...
In this paper, we describe an algorithm and implementation of locality optimizations for architectures with instruction sets such as Intel’s SSE and Motorola’s AltiVec that su...
- We propose a near optimal hardware architecture for deblocking filter in H.264/MPEG-4 AVC. We propose a novel filtering order and a data reuse strategy that result in significant...
This paper presents a new modulo scheduling algorithm for clustered microarchitectures. The main feature of the proposed scheme is that the assignment of instructions to clusters ...