Sciweavers

183 search results - page 29 / 37
» History-based memory mode prediction for improving memory pe...
Sort
View
ICS
1999
Tsinghua U.
13 years 11 months ago
Software trace cache
—This paper explores the use of compiler optimizations which optimize the layout of instructions in memory. The target is to enable the code to make better use of the underlying ...
Alex Ramírez, Josep-Lluis Larriba-Pey, Carl...
CODES
2004
IEEE
13 years 11 months ago
A loop accelerator for low power embedded VLIW processors
The high transistor density afforded by modern VLSI processes have enabled the design of embedded processors that use clustered execution units to deliver high levels of performan...
Binu K. Mathew, Al Davis
PROCEDIA
2011
12 years 10 months ago
GPU-accelerated Chemical Similarity Assessment for Large Scale Databases
The assessment of chemical similarity between molecules is a basic operation in chemoinformatics, a computational area concerning with the manipulation of chemical structural info...
Marco Maggioni, Marco D. Santambrogio, Jie Liang
AAECC
2007
Springer
111views Algorithms» more  AAECC 2007»
13 years 7 months ago
When cache blocking of sparse matrix vector multiply works and why
Abstract. We present new performance models and a new, more compact data structure for cache blocking when applied to the sparse matrixvector multiply (SpM×V) operation, y ← y +...
Rajesh Nishtala, Richard W. Vuduc, James Demmel, K...
CGO
2010
IEEE
14 years 2 months ago
Integrated instruction selection and register allocation for compact code generation exploiting freeform mixing of 16- and 32-bi
For memory constrained embedded systems code size is at least as important as performance. One way of increasing code density is to exploit compact instruction formats, e.g. ARM T...
Tobias J. K. Edler von Koch, Igor Böhm, Bj&ou...