Effective use of the memory hierarchy is critical for achieving high performance on embedded systems. We focus on the class of streaming applications, which is increasingly preval...
Janis Sermulins, William Thies, Rodric M. Rabbah, ...
We present a new VLIW core as a successor to the TriMedia TM1000. The processor is targeted for embedded use in media-processing devices like DTVs and set-top boxes. Intended as a...
Jos T. J. van Eijndhoven, Kees A. Vissers, Evert-J...
A reduction is a computation in which a common operation, such as a sum, is to be performed across multiple pieces of data, each supplied by a separate task. We introduce phaser a...
Jun Shirako, David M. Peixotto, Vivek Sarkar, Will...
Atomic blocks allow programmers to delimit sections of code as ‘atomic’, leaving the language’s implementation to enforce atomicity. Existing work has shown how to implement...
Timothy L. Harris, Mark Plesko, Avraham Shinnar, D...
This paper presents our progress on OpenVL - a novel software architecture to address efficiency through facilitating hardware acceleration, reusability and scalability for comput...