This paper discusses three techniques useful in relaxing the constraints imposed by control flow on parallelism: control dependence analysis, executing multiple flows of control s...
Schedulability analysis of real-time embedded systems requires worst-case timing guarantees of embedded software performance. This involves not only language-level program analysi...
Current processors exploit out-of-order execution and branch prediction to improve instruction-level parallelism. When a branch prediction is wrong, processors flush the pipeline ...
This paper presents the Dynamic Simultaneous Multithreaded Architecture (DSMT). DSMT efficiently executes multiple threads from a single program on an SMT processor core. To accomp...
Multi-core processors, with low communication costs and high availability of execution cores, will increase the use of execution and compilation models that use short threads to e...