Data prefetching effectively reduces the negative effects of long load latencies on the performance of modern processors. Hardware prefetchers employ hardware structures to predic...
Jamison D. Collins, Suleyman Sair, Brad Calder, De...
As the "system-on-a-chip" concept is rapidly becoming a reality, time-to-market and product complexity push the reuse of complex macromodules. Circuits combining a varie...
In this article, we present a parallel implementation of a 1024 point Fast Fourier Transform (FFT) operating with a subthreshold supply voltage, which is below the voltage that tur...
The designs of high-performance processor architectures are moving toward the integration of a large number of multiple processing cores on a single chip. The IBM Cyclops-64 (C64)...
Various architectural power reduction techniques have been proposed for on-chip caches in the last decade. In this paper, we first show that these power reduction techniques can b...
Ja Chun Ku, Serkan Ozdemir, Gokhan Memik, Yehea I....