In this paper we propose a simple extension to the I/O architecture of scalable multiprocessors that optimizes page swap-outs significantly. More specifically, we propose the use o...
The performance of applications on large shared-memory multiprocessors with coherent caches depends on the interaction between the granularity of data sharing, the size of the coh...
Mahmut T. Kandemir, Alok N. Choudhary, J. Ramanuja...
We introduce a set of techniques to both measure and optimize memory access locality of Java applications running on cc-NUMA servers. These techniques work at the object level and...
Good spatial locality alleviates both the latency and bandwidth problem of memory by boosting the effect of prefetching and improving the utilization of cache. However, convention...
Xiaoming Gu, Ian Christopher, Tongxin Bai, Chengli...
Exploiting locality at run-time is a complementary approach to a compiler approach for those applications with dynamic memory access patterns. This paper proposes a memory-layout ...