Thread escape analysis conservatively determines which objects may be accessed in more than one thread. Thread escape analysis is useful for a variety of purposes – finding rac...
We present and evaluate the idea of adaptive processor cache management. Specifically, we describe a novel and general scheme by which we can combine any two cache management alg...
Ranjith Subramanian, Yannis Smaragdakis, Gabriel H...
Reuse distance (i.e. LRU stack distance) precisely characterizes program locality and has been a basic tool for memory system research since the 1970s. However, the high cost of m...
Xipeng Shen, Jonathan Shaw, Brian Meeker, Chen Din...
Abstract—The increasing number of cores per node in highperformance computing requires an efficient intra-node MPI communication subsystem. Most existing MPI implementations rel...
There has been intensive research on data prefetching focusing on performance improvement, however, the energy aspect of prefetching is relatively unknown. Our experiments show th...
Yao Guo, Saurabh Chheda, Israel Koren, C. Mani Kri...