Abstract. Multimedia and network processing applications make extensive use of subword data. Since registers are capable of holding a full data word, when a subword variable is ass...
Existing SIMD extensions cannot efficiently vectorize the histogram function due to memory collisions. We propose two techniques to avoid this problem. In the first, a hierarchi...
Asadollah Shahbahrami, Ben H. H. Juurlink, Stamati...
We revisit memory hierarchy design viewing memory as an inter-operation communication agent. This perspective leads to the development of novel methods of performing inter-operati...
Speed improvements in today's processors have largely been delivered in the form of multiple cores, increasing the importance of ions that ease parallel programming. Software...
Branches that depend directly or indirectly on load instructions are a leading cause of mispredictions by state-of-the-art branch predictors. For a branch of this type, there is a...