Sciweavers

955 search results - page 8 / 191
» Performance optimization of multiple memory architectures fo...
Sort
View
132
Voted
TPDS
2010
144views more  TPDS 2010»
15 years 27 days ago
Performance Evaluation of Dynamic Speculative Multithreading with the Cascadia Architecture
—Thread-level parallelism (TLP) has been extensively studied in order to overcome the limitations of exploiting instruction-level parallelism (ILP) on high-performance superscala...
David A. Zier, Ben Lee
117
Voted
ICPP
2003
IEEE
15 years 7 months ago
Procedural Level Address Offset Assignment of DSP Applications with Loops
Automatic optimization of address offset assignment for DSP applications, which reduces the number of address arithmetic instructions to meet the tight memory size restrictions an...
Youtao Zhang, Jun Yang 0002
125
Voted
IPPS
1996
IEEE
15 years 6 months ago
A Method for Register Allocation to Loops in Multiple Register File Architectures
Multiple instruction issue processors place high demands on register file bandwidth. One solution to reduce this bottleneck is the use of multiple register files. Register allocat...
David J. Kolson, Alexandru Nicolau, Nikil D. Dutt,...
152
Voted
ICPP
2009
IEEE
15 years 9 months ago
Perfomance Models for Blocked Sparse Matrix-Vector Multiplication Kernels
—Sparse Matrix-Vector multiplication (SpMV) is a very challenging computational kernel, since its performance depends greatly on both the input matrix and the underlying architec...
Vasileios Karakasis, Georgios I. Goumas, Nectarios...
MICRO
2002
IEEE
173views Hardware» more  MICRO 2002»
15 years 7 months ago
Vector vs. superscalar and VLIW architectures for embedded multimedia benchmarks
Multimedia processing on embedded devices requires an architecture that leads to high performance, low power consumption, reduced design complexity, and small code size. In this p...
Christoforos E. Kozyrakis, David A. Patterson