Next generation tiled microarchitectures are going to be limited by off-chip misses and by on-chip network usage. Furthermore, these platforms will run an heterogeneous mix of ap...
In this paper, we look at two issues which could affect the performance of value prediction on wide-issue ILP processors. One is the large number of accesses to the value predicti...
Communication in cache-coherent distributed shared memory (DSM) often requires invalidating (or writing back) cached copies of a memory block, incurring high overheads. This paper...
As compilers increasingly rely on optimizations to achieve high performance, the effectiveness of source level debuggers for optimized code continues to falter. Even if values of s...
The need to reuse the performance macromodels of an analog circuit topology challenges existing regression based modeling techniques. A model of good reusability should have a num...