Sciweavers

501 search results - page 41 / 101
» Evaluating CMPs and Their Memory Architecture
Sort
View
HPCA
1997
IEEE
13 years 12 months ago
Global Address Space, Non-Uniform Bandwidth: A Memory System Performance Characterization of Parallel Systems
Many parallel systems offer a simple view of memory: all storage cells are addresseduniformly. Despite a uniform view of the memory, the machines differsignificantly in theirmemo...
Thomas Stricker, Thomas R. Gross
HPCA
2004
IEEE
14 years 8 months ago
Signature Buffer: Bridging Performance Gap between Registers and Caches
Data communications between producer instructions and consumer instructions through memory incur extra delays that degrade processor performance. In this paper, we introduce a new...
Lu Peng, Jih-Kwon Peir, Konrad Lai
WOSP
2004
ACM
14 years 1 months ago
Collecting whole-system reference traces of multiprogrammed and multithreaded workloads
The simulated evaluation of memory management policies relies on reference traces—logs of memory operations performed by running processes. No existing approach to reference tra...
Scott F. Kaplan
IISWC
2008
IEEE
14 years 2 months ago
Accelerating multi-core processor design space evaluation using automatic multi-threaded workload synthesis
The design and evaluation of microprocessor architectures is a difficult and time-consuming task. Although small, handcoded microbenchmarks can be used to accelerate performance e...
Clay Hughes, Tao Li
ACMMSP
2004
ACM
92views Hardware» more  ACMMSP 2004»
14 years 1 months ago
Instruction combining for coalescing memory accesses using global code motion
Instruction combining is an optimization to replace a sequence of instructions with a more efficient instruction yielding the same result in a fewer machine cycles. When we use it...
Motohiro Kawahito, Hideaki Komatsu, Toshio Nakatan...