Sciweavers

315 search results - page 52 / 63
» On reducing load store latencies of cache accesses
JSA
2006
13 years 7 months ago
Pattern-driven prefetching for multimedia applications on embedded processors
Multimedia applications in general, and video processing such as MPEG-4 Visual stream decoders in particular, are increasingly popular and important workloads for future embedd...
Hassan Sbeyti, Smaïl Niar, Lieven Eeckhout
ASPLOS
2010
ACM
13 years 11 months ago
Micro-pages: increasing DRAM efficiency with locality-aware data placement
Power consumption and DRAM latencies are serious concerns in modern chip-multiprocessor (CMP or multi-core) based compute systems. The management of the DRAM row buffer can signif...
Kshitij Sudan, Niladrish Chatterjee, David Nellans...
DAC
2004
ACM
14 years 8 months ago
Multi-profile based code compression
Code compression has been shown to be an effective technique to reduce code size in memory constrained embedded systems. It has also been used as a way to increase cache hit ratio...
Eduardo Wanderley Netto, Rodolfo Azevedo, Paulo Ce...
ANCS
2009
ACM
13 years 5 months ago
Range Tries for scalable address lookup
In this paper we introduce the Range Trie, a new multiway tree data structure for address lookup. Each Range Trie node maps to an address range [Na, Nb) and performs multiple comp...
Ioannis Sourdis, Georgios Stefanakis, Ruben de Sme...
HPCA
2006
IEEE
14 years 8 months ago
Software-hardware cooperative memory disambiguation
In high-end processors, increasing the number of in-flight instructions can improve performance by overlapping useful processing with long-latency accesses to the main memory. Buf...
Ruke Huang, Alok Garg, Michael C. Huang