Sciweavers

2784 search results - page 21 / 557
» Instruction Level Parallelism
Sort
View
HICSS
1995
IEEE
109views Biometrics» more  HICSS 1995»
14 years 1 months ago
The architecture of an optimistic CPU: the WarpEngine
The architecture for a shared memory CPU is described. The CPU allows for parallelism down to the level of single instructions and is tolerant of memory latency. All executable in...
John G. Cleary, Murray Pearson, Husam Kinawi
HPCA
2005
IEEE
14 years 10 months ago
A Small, Fast and Low-Power Register File by Bit-Partitioning
A large multi-ported register file is indispensable for exploiting instruction level parallelism (ILP) in today's dynamically scheduled superscalar processors. The number of ...
Masaaki Kondo, Hiroshi Nakamura
WMPI
2004
ACM
14 years 3 months ago
Scalable cache memory design for large-scale SMT architectures
The cache hierarchy design in existing SMT and superscalar processors is optimized for latency, but not for bandwidth. The size of the L1 data cache did not scale over the past dec...
Muhamed F. Mudawar
CGO
2007
IEEE
14 years 4 months ago
Loop Optimization using Hierarchical Compilation and Kernel Decomposition
The increasing complexity of hardware features for recent processors makes high performance code generation very challenging. In particular, several optimization targets have to b...
Denis Barthou, Sébastien Donadio, Patrick C...