Sciweavers

» Memory Organization for Improved Data Cache Performance in E...
CASES
2007
ACM
14 years 13 days ago
An integrated ARM and multi-core DSP simulator
In this paper we describe the design and implementation of a flexible and extensible just-in-time ARM simulator designed to run co-operatively with a multi-core DSP simulator on...
Sharad Singhai, MingYung Ko, Sanjay Jinturkar, May...
CASES
2006
ACM
14 years 2 months ago
Supporting precise garbage collection in Java Bytecode-to-C ahead-of-time compiler for embedded systems
A Java bytecode-to-C ahead-of-time compiler (AOTC) can improve the performance of a Java virtual machine (JVM) by translating bytecode into C code, which is then compiled into mac...
Dong-Heon Jung, Sung-Hwan Bae, Jaemok Lee, Soo-Moo...
LCTRTS
2007
Springer
14 years 2 months ago
Integrated CPU and L2 cache voltage scaling using machine learning
Embedded systems serve an emerging and diverse set of applications. As a result, more computational and storage capabilities are added to accommodate ever more demanding applicati...
Nevine AbouGhazaleh, Alexandre Ferreira, Cosmin Ru...
ICS
2003
Tsinghua U.
14 years 1 months ago
Inferential queueing and speculative push for reducing critical communication latencies
Communication latencies within critical sections constitute a major bottleneck in some classes of emerging parallel workloads. In this paper, we argue for the use of Inferentially...
Ravi Rajwar, Alain Kägi, James R. Goodman
IPPS
2007
IEEE
14 years 2 months ago
Optimizing Inter-Nest Data Locality Using Loop Splitting and Reordering
With the increasing gap between processor speed and memory latency, the performance of data-dominated programs is becoming more reliant on fast data access, which can be improved...
Sofiane Naci