With the widening performance gap between processors and main memory, efficient memory accessing behavior is necessary for good program performance. Loop partition is an effective...
Previous work on DRAM power-mode management focused on hardware-based techniques and compiler-directed schemes to explicitly transition unused memory modules to low-power operatin...
Victor Delaluz, Anand Sivasubramaniam, Mahmut T. K...
This paper introduces a compiler-orchestrated prefetching system as a unified framework geared toward ameliorating the gap between processing speeds and memory access latencies. ...
Rodric M. Rabbah, Hariharan Sandanagobalane, Mongk...
Caches are notorious for their unpredictability. It is difficult or even impossible to predict if a memory access results in a definite cache hit or miss. This unpredictability i...
Most programming languages support a call stack in the programming model and also in the runtime system. We show that for applications targeting low-power embedded microcontroller...