Due to shared cache contentions and interconnect delays, data prefetching is more critical in alleviating penalties from increasing memory latencies and demands on Chip-Multiproce...
Xudong Shi, Zhen Yang, Jih-Kwon Peir, Lu Peng, Yen...
Many nonblocking algorithms have been proposed for shared queues. Previous studies indicate that link-based algorithms perform best. However, these algorithms have a memory manage...
Pomegranate is a parallel hardware architecture for polygon rendering that provides scalable input bandwidth, triangle rate, pixel rate, texture memory and display bandwidth while...
This paper compares the performance of hierarchical ring- and mesh-connected wormhole routed shared memory multiprocessor networks in a simulation study. Hierarchical rings are in...
The least recently used (LRU) replacement policy performs poorly in the last-level cache (LLC) because temporal locality of memory accesses is filtered by first and second level...
Samira Manabi Khan, Zhe Wang, Daniel A. Jimé...