Two broad classes of memory models are available today: models with hardware cache coherence, used in conventional chip multiprocessors, and models that rely upon software to mana...
John H. Kelm, Daniel R. Johnson, William Tuohy, St...
Modern microprocessor designs continue to obtain impressive performance gains through increasing clock rates and advances in the parallelism obtained via micro-architecture design...
This paper evaluates how much extended dictionary-based code compression techniques can reduce the static code size. In their simplest form, such methods statically identify ident...
Pipeline Spectroscopy is a new technique that allows us to measure the cost of each cache miss. The cost of a miss is displayed (graphed) as a histogram, which represents a precis...
Thomas R. Puzak, Allan Hartstein, Philip G. Emma, ...
Over the last decade, Message Passing Interface (MPI) has become a very successful parallel programming environment for distributed memory architectures such as clusters. However, ...