This paper explores an application-specific customization technique for the data cache, one of the foremost area/power consuming and performance determining microarchitectural fea...
This paper introduces a flexible code generation framework dedicated to the design of application specific programmable processors. This tool allows the user to build specific com...
Clustered ILP processors are characterized by a large number of non-centralized on-chip resources grouped into clusters. Traditional code generation schemes for these processors c...
Krishnan Kailas, Kemal Ebcioglu, Ashok K. Agrawala
Abstract--Loop tiling is an important compiler transformation used for enhancing data locality and exploiting coarsegrained parallelism. Tiled codes in which tile sizes are runtime...
Albert Hartono, Muthu Manikandan Baskaran, J. Rama...
A well-known challenge during processor design is to obtain the best possible results for a typical target application domain that is generally described as a set of benchmarks. O...