This paper describes a compiler for stream programs that efficiently schedules computational kernels and stream memory operations, and allocates on-chip storage. Our compiler uses...
Testing and diagnosis are important issues in system-onchip (SOC) development, as more and more embedded cores are being integrated into the chips. In this paper we propose a buil...
The design of high-throughput large-state Viterbi decoders relies on the use of multiple arithmetic units. The global communication channels among these parallel processors often ...
In modern superscalar processors, the complex instruction scheduler could form the critical path of the pipeline stages and limit the clock cycle time. In addition, complex schedu...
DTA (Decoupled Threaded Architecture) is designed to exploit fine/medium grained Thread Level Parallelism (TLP) by using a distributed hardware scheduling unit and relying on exi...