Clustered ILP processors are characterized by a large number of non-centralized on-chip resources grouped into clusters. Traditional code generation schemes for these processors c...
Krishnan Kailas, Kemal Ebcioglu, Ashok K. Agrawala
Caches are very inefficiently utilized because not all the excess data fetched into the cache, to exploit spatial locality, is utilized. We define cache utilization as the percent...
This paper describes a hardware-software co-design approach for flexible programmable Galois Field Processing for applications which require operations over GF(2m ), such as RS an...
In a CMOS combinational logic circuit, the subthreshold leakage current in the standby state depends on the state of the inputs. In this paper we present a new approach to identif...
We present a cross-layer customization methodology for latency and bandwidth efficient inter-core communication in embedded multiprocessors. The methodology integrates compiler, o...