Usual cache optimisation techniques for high performance computing are difficult to apply in embedded VLIW applications. First, embedded applications are not always well structur...
Samir Ammenouche, Sid Ahmed Ali Touati, William Ja...
—This paper explores the use of compiler optimizations which optimize the layout of instructions in memory. The target is to enable the code to make better use of the underlying ...
The overhead of copying data through the central processor by a message passing protocol limits data transfer bandwidth. If the network interface directly transfers the user'...
Hiroshi Tezuka, Francis O'Carroll, Atsushi Hori, Y...
Abstract. This paper is motivated by the desire to provide an efficient and scalable software cache implementation of OpenMP on multicore and manycore architectures in general, and...
Chen Chen, Joseph B. Manzano, Ge Gan, Guang R. Gao...
As multiprocessors are scaled beyond single bus systems, there is renewed interest in directory-based cache coherence schemes. These schemes rely on a directory to keep track of a...