Aggressive compiler optimizations such as software pipelining and loop invariant code motion can significantly improve application performance, but these transformations often re...
Chris Zimmer, Stephen Roderick Hines, Prasad Kulka...
Although NAND flash memory has become one of the most popular storage media for portable devices, it has a serious problem with respect to lifetime. Each block of NAND flash memor...
Dawoon Jung, Yoon-Hee Chae, Heeseung Jo, Jinsoo Ki...
We propose a technique which leverages configurable data caches to address the problem of cache interference in multitasking embedded systems. Data caches are often necessary to p...
ABSTRACT This paper presents the first scratch-pad memory allocation scheme that requires no compiler support for interpreted-language based applications. A scratch-pad memory (SPM...
Many MPSoC applications are loop-intensive and amenable to automatic parallelization with suitable compiler support. One of the key components of any compiler-parallelized code is...
This paper presents the first automatic scheme to allocate local (stack) data in recursive functions to scratch-pad memory (SPM) in embedded systems. A scratch-pad is a fast direct...
Embedded platforms are resource-constrained systems in which performance and memory requirements of executed code are of critical importance. However, standard techniques such as ...
Carmen Badea, Alexandru Nicolau, Alexander V. Veid...
Delayed branching is a technique to alleviate branch hazards without expensive hardware branch prediction mechanisms. For VLIW processors with deep pipelines and many issue slots,...
Dynamic binary translation (DBT) has been used to achieve numerous goals (e.g., better performance) for general-purpose computers. Recently, DBT has also attracted attention for e...
Prior work on modeling interconnects has focused on optimizing the wire and repeater design for trading off energy and delay, and is largely based on low level circuit parameters....
Rahul Nagpal, Arvind Madan, Bharadwaj Amrutur, Y. ...