Streamlining communication is key to achieving good performance in shared-memory parallel programs. While full hardware support for cache coherence generally offers the best perfo...
Simultaneous Multithreading (SMT) is a technology aimed at improving the throughput of the processor core by applying Instruction Level Parallelism (ILP) and Thread Level Parallel...
Previous work has shown that cache line sizes impact performance differently for different desktop programs – some programs work better with small line sizes, others with larger...