This paper shows that even very small data caches, when split to serve data streams exhibiting temporal and spatial locality, can improve the performance of embedded applications without consuming excessive silicon real estate or power. It also shows that large block sizes and higher set associativities are unnecessary with split cache organizations. We use benchmark programs from MiBench to show that our cache organization outperforms other organizations in terms of miss rates, access times, energy consumption, and silicon area.
Afrin Naz, Krishna M. Kavi, Wentong Li, Philip H.