Concurrent multithreaded architectures exploit both instruction-level and thread-level parallelism through a combination of branch prediction and thread-level control speculation. ...
Instruction combining is an optimization to replace a sequence of instructions with a more efficient instruction yielding the same result in a fewer machine cycles. When we use it...
In this paper, we propose a technique that uses an additional mini cache, the L0-Cache, located between the instruction cache I-Cache and the CPU core. This mechanism can provid...
Nikolaos Bellas, Ibrahim N. Hajj, Constantine D. P...
Multi-core chips have been increasingly adopted by microprocessor industry. For real-time systems to safely harness the potential of multi-core computing, designers must be able t...
Communication in cache-coherent distributed shared memory (DSM) often requires invalidating (or writing back) cached copies of a memory block, incurring high overheads. This paper...