We present a new arrangement of directory bits called the segment directory to improve directory storage efficiency: a segment directory can point to several sharing processors wi...
There are two basic models for the on-chip memory in CMP systems: hardware-managed coherent caches and software-managed streaming memory. This paper performs a direct comparison o...
Jacob Leverich, Hideho Arakida, Alex Solomatnikov,...
—Performance degradation of memory-intensive programs caused by the LRU policy’s inability to handle weaklocality data accesses in the last level cache is increasingly serious ...
We consider the formal verification of the cache coherence protocol of the Stanford FLASH multiprocessor for N processors. The proof uses the SMV proof assistant, a proof system ba...
Run-time parallelization is often the only way to execute the code in parallel when data dependence information is incomplete at compile time. This situation is common in many imp...