In this paper we describe a new technique, called pipeline spectroscopy, and use it to measure the cost of each cache miss. The cost of a miss is displayed (graphed) as a histogra...
Thomas R. Puzak, Allan Hartstein, Philip G. Emma, ...
Recent superscalar processors issue four instructions per cycle. These processors are also powered by highly-parallel superscalar cores. The potential performance can only be expl...
Thomas M. Conte, Kishore N. Menezes, Patrick M. Mi...
We study the problem of sorting on a parallel computer with limited communication bandwidth. By using the PRAM(m) model, where p processors communicate through a globally shared me...
—Duplication and comparison has proven to be an efficient method for error detection. Based on this generic principle dual core processor architectures with output comparison ar...
DSP processors usually provide dedicated address generation units (AGUs) to assist address computation. By carefully allocating variables in the memory, DSP compilers take advanta...