In this paper we study the performance improvements and trade-offs derived from an optimized mapping approach applied on a parametric coarse grained reconfigurable array architect...
Grigoris Dimitroulakos, Michalis D. Galanis, Const...
The inherent instruction-level parallelism (ILP) of current applications (specially those based on floating point computations) has driven hardware designers and compilers writers...
Many commercially available embedded processors are capable of extending their base instruction set for a specific domain of applications. While steady progress has been made in t...
Clustering is an effective method to increase the available parallelism in VLIW datapaths without incurring severe penalties associated with large number of register file ports. E...
Viktor S. Lapinskii, Margarida F. Jacome, Gustavo ...
The poor scalability of existing superscalar processors has been of great concern to the computer engineering community. In particular, the critical-path lengths of many components...
Bradley C. Kuszmaul, Dana S. Henry, Gabriel H. Loh