As multiprocessor systems-on-chip become a reality, performance modeling becomes a challenge. To quickly evaluate many architectures, some type of high-level simulation is require...
Joshua J. Pieper, Alain Mellan, JoAnn M. Paul, Don...
Cache locality optimization is an efficient way for reducing the idle time of modern processors in waiting for needed data. This kind of optimization can be achieved either on the...
This paper explores the architectural implications of integrating computation and molecular probes to form nanoscale sensor processors (nSP). We show how nSPs may enable new compu...
Constantin Pistol, Christopher Dwyer, Alvin R. Leb...
Today's commodity microprocessors require a low latency memory system to achieve high sustained performance. The conventional high-performance memory system provides fast dat...
Communication in cache-coherent distributed shared memory (DSM) often requires invalidating (or writing back) cached copies of a memory block, incurring high overheads. This paper...