Abstract. High-performance design flows for FPGAs often rely on module generators to counter coarse logic-block granularity and limited routing resources, However, the very flexi...
Recent studies have shown that in highly associative caches, the performance gap between the Least Recently Used (LRU) and the theoretical optimal replacement algorithms is large,...
In high-level synthesis, pipelined designs are often restricted by the number of memory banks available to the synthesis system. Using multiple memory banks can improve the perfor...
This paper presents a novel architectural technique to hide fetch latency overhead of hardware encrypted and authenticated memory. A number of recent secure processor designs have...
Abstract--Polynomial approximation is a very general technique for the evaluation of a wide class of numerical functions of one variable. This article details an architecture gener...