We present preliminary results of a project to create a tuning system that adaptively optimizes programs to the underlying execution platform. We will show initial results from tw...
Abstract. This paper presents the design, implementation and experimental evaluation of a practical region-based partial dead code elimination (PDE) algorithm on predicated code in...
Datapath width optimization is very effective for reducing the area of a custom-made embedded system. The trivial way of optimization is to iteratively customize, evaluate, and r...
Designing Application-Specific Instruction-set Processors (ASIPs) usually requires designing a custom datapath, and modifying instruction-set, instruction decoder, and compiler. A...
The memory access limits the performance of stream processors. By exploiting the reuse of data held in the Stream Register File (SRF), an on-chip storage, the number of memory acc...
Xuejun Yang, Ying Zhang, Jingling Xue, Ian Rogers,...