Fetching instructions from a set-associative cache in an embedded processor can consume a large amount of energy due to the tag checks performed. Recent proposals to address this ...
Timothy M. Jones, Sandro Bartolini, Bruno De Bus, ...
Conventional processors use a fully-associative store queue (SQ) to implement store-load forwarding. Associative search latency does not scale well to capacities and bandwidths re...
Conventional prefetching schemes regard prediction accuracy as important because useless data prefetched by a faulty prediction may pollute the cache. If prefetching requires cons...
As applications tend to grow more complex and use more memory, the demand for cache space increases. Thus embedded processors are inclined to use larger caches. Predicting a miss i...
Cache blocks often exhibit a small number of uses during their life time in the last-level cache. Past research has exploited this property in two different ways. First, replacem...