Sciweavers

140 search results - page 14 / 28
» On Reducing False Sharing while Improving Locality on Shared...
Sort
View
ASPLOS
2012
ACM
12 years 3 months ago
Aikido: accelerating shared data dynamic analyses
Despite a burgeoning demand for parallel programs, the tools available to developers working on shared-memory multicore processors have lagged behind. One reason for this is the l...
Marek Olszewski, Qin Zhao, David Koh, Jason Ansel,...
IWMM
2011
Springer
270views Hardware» more  IWMM 2011»
12 years 10 months ago
Memory management in NUMA multicore systems: trapped between cache contention and interconnect overhead
Multiprocessors based on processors with multiple cores usually include a non-uniform memory architecture (NUMA); even current 2-processor systems with 8 cores exhibit non-uniform...
Zoltan Majo, Thomas R. Gross
IPPS
2003
IEEE
14 years 22 days ago
Miss Penalty Reduction Using Bundled Capacity Prefetching in Multiprocessors
While prefetch has proven itself useful for reducing cache misses in multiprocessors, traffic is often increased due to extra unused prefetch data. Prefetching in multiprocessors...
Dan Wallin, Erik Hagersten
ARVLSI
1997
IEEE
151views VLSI» more  ARVLSI 1997»
13 years 11 months ago
The Hierarchical Multi-Bank DRAM: A High-Performance Architecture for Memory Integrated with Processors
A microprocessor integrated with DRAM on the same die has the potential to improve system performance by reducing the memory latency and improving the memory bandwidth. However, a...
Tadaaki Yamauchi, Lance Hammond, Kunle Olukotun
ICS
2005
Tsinghua U.
14 years 29 days ago
A NUCA substrate for flexible CMP cache sharing
We propose an organization for the on-chip memory system of a chip multiprocessor, in which 16 processors share a 16MB pool of 256 L2 cache banks. The L2 cache is organized as a n...
Jaehyuk Huh, Changkyu Kim, Hazim Shafi, Lixin Zhan...