After many years of prefetching research, most commercially available systems support only two types of prefetching: software-directed prefetching and hardware-based prefetchers u...
The Asynchronous Polycyclic Architecture (APA) is a new processor design for numerically intensive applications. APA resembles the VLIW architecture, in that it provides independen...
Loops are the main time consuming part of programs based on floating point computations. The performance of the loops is limited either by recurrences in the computation or by the...
Most commercial and academic floating point libraries for FPGAs provide only a small fraction of all possible floating point units. In contrast, the floating point unit generat...
Field programmable gate arrays (FPGAs), graphics processing units (GPUs) and Sony’s Playstation 2 vector units offer scope for hardware acceleration of applications. Implementin...
Lee W. Howes, Paul Price, Oskar Mencer, Olav Beckm...