Sciweavers

HPCA
2003
IEEE

Exploring the VLSI Scalability of Stream Processors

15 years 24 days ago
Exploring the VLSI Scalability of Stream Processors
Stream processors are high-performance programmable processors optimized to run media applications. Recent work has shown these processors to be more area- and energy-efficient than conventional programmable architectures. This paper explores the scalability of stream architectures to future VLSI technologies where over a thousand floating-point units on a single chip will be feasible. Two techniques for increasing the number of ALUs in a stream processor are presented: intracluster and intercluster scaling. These scaling techniques are shown to be cost-efficient to tens of ALUs per cluster and to hundreds of arithmetic clusters. A 640-ALU stream processor with 128 clusters and 5 ALUs per cluster is shown to be feasible in 45 nanometer technology, sustaining over 300 GOPS on kernels and providing 15.3x of kernel speedup and 8.0x of application speedup over a 40-ALU stream processor with a 2% degradation in area per ALU and a 7% degradation in energy dissipated per ALU operation.
Brucek Khailany, William J. Dally, Scott Rixner, U
Added 01 Dec 2009
Updated 01 Dec 2009
Type Conference
Year 2003
Where HPCA
Authors Brucek Khailany, William J. Dally, Scott Rixner, Ujval J. Kapasi, John D. Owens, Brian Towles
Comments (0)