Compilation has a long history of translating a programmer's human-readable code into machine instructions designed to make good use of a specific target computer. In this pa...
This paper describes the implementation of a runtime library for asynchronous communication in the Cell BE processor. The runtime library implementation provides with several servi...
This paper presents a new superscalar architecture for fast discrete cosine transform (DCT). Comparing with the general SIMD architecture, it speeds up the DCT computation by a fac...
Compiler technology for multimedia extensions must effectively utilize not only the SIMD compute engines but also the various levels of the memory hierarchy: superword registers,...
Chun Chen, Jaewook Shin, Shiva Kintali, Jacqueline...
Graphics and media processing is quickly emerging to become one of the key computing workloads. Programmable graphics processors give designers extra flexibility by running a sma...
Mauricio Breternitz Jr., Herbert H. J. Hum, Sanjee...