Parallel programs that modify shared data in a cachecoherent multiprocessor with a write-invalidate coherence protocol create ownership overhead in the form of ownership acquisiti...
Abstract. Customizable processors augmented with application-specific Instruction Set Extensions (ISEs) have begun to gain traction in recent years. The most effective ISEs include...
Theo Kluter, Samuel Burri, Philip Brisk, Edoardo C...
Software distributed shared memory (DSM) techniques, while effective on applications with coarse-grained sharing, yield poor performance for the fine-grained sharing encountered i...
Bus-based shared memory multiprocessors with private caches and snooping write-invalidate cache coherence protocols are dominant form of small- to medium-scale parallel machines t...
We present a cross-layer customization methodology for latency and bandwidth efficient inter-core communication in embedded multiprocessors. The methodology integrates compiler, o...