Sciweavers

923 search results - page 71 / 185
» Shared Memory Performance Profiling
IWMM
2011
Springer
12 years 11 months ago
Memory management in NUMA multicore systems: trapped between cache contention and interconnect overhead
Multiprocessors based on processors with multiple cores usually include a non-uniform memory architecture (NUMA); even current 2-processor systems with 8 cores exhibit non-uniform...
Zoltan Majo, Thomas R. Gross
LCPC
2007
Springer
14 years 3 months ago
Automatic Communication Performance Debugging in PGAS Languages
Recent studies have shown that programming in a Partitioned Global Address Space (PGAS) language can be more productive than programming in a message passing model. One reason for th...
Jimmy Su, Katherine A. Yelick
SIGMOD
2010
ACM
14 years 1 month ago
Performing sound flash device measurements: some lessons from uFLIP
It is amazingly easy to get meaningless results when measuring flash devices, partly because of the peculiarity of flash memory, but primarily because their behavior is determin...
Matias Bjørling, Lionel Le Folgoc, Ahmed Ms...
ICCD
2006
IEEE
14 years 5 months ago
Microarchitecture and Performance Analysis of Godson-2 SMT Processor
This paper introduces the microarchitecture and logical implementation of the SMT (Simultaneous Multithreading) improvement of the Godson-2 processor, which is a 64-bit, four-issue, out-...
Zusong Li, Xianchao Xu, Weiwu Hu, Zhimin Tang
SC
2004
ACM
14 years 2 months ago
Predicting and Evaluating Distributed Communication Performance
Application of hardware-parameterized models to distributed systems can result in omission of key bottlenecks such as the full cost of inter- and intra-node communication in a c...
Kirk W. Cameron, Rong Ge