Sciweavers

41 search results - page 7 / 9
» Access Region Locality for High-Bandwidth Processor Memory S...
ARCS
2009
Springer
14 years 1 month ago
Improving Memory Subsystem Performance Using ViVA: Virtual Vector Architecture
The disparity between microprocessor clock frequencies and memory latency is a primary reason why many demanding applications run well below peak achievable performance. Software c...
Joseph Gebis, Leonid Oliker, John Shalf, Samuel Wi...
ASPLOS
2010
ACM
13 years 10 months ago
Micro-pages: increasing DRAM efficiency with locality-aware data placement
Power consumption and DRAM latencies are serious concerns in modern chip-multiprocessor (CMP or multi-core) based compute systems. The management of the DRAM row buffer can signif...
Kshitij Sudan, Niladrish Chatterjee, David Nellans...
ISLPED
2005
ACM
14 years 16 days ago
Region-level approximate computation reuse for power reduction in multimedia applications
Motivated by data value locality and quality tolerance present in multimedia applications, we propose a new micro-architecture, Region-level Approximate Computation Buffer...
Xueqi Cheng, Michael S. Hsiao
ASAP
2008
IEEE
13 years 9 months ago
Lightweight DMA management mechanisms for multiprocessors on FPGA
This paper presents a multiprocessor system on FPGA that adopts Direct Memory Access (DMA) mechanisms to move data between the external memory and the local memory of each process...
Antonino Tumeo, Matteo Monchiero, Gianluca Palermo...
IWOMP
2007
Springer
14 years 1 month ago
Supporting OpenMP on Cell
The Cell processor is a heterogeneous multi-core processor with one Power Processing Engine (PPE) core and eight Synergistic Processing Engine (SPE) cores. Each SPE has a directly...
Kevin O'Brien, Kathryn M. O'Brien, Zehra Sura, Ton...