This paper presents a middleware capable of out-of-order execution of kernels and data transfers for efficient stream processing in the compute unified device architecture (CUDA). ...
Out-of-order execution significantly increases the performance of superscalar processors. The out-of-order execution mechanism is, however, energy-inefficient, which inhibits scal...
Hans Vandierendonck, Philippe Manet, Thibault Dela...
We present a two-part approach for verifying out-of-order execution. First, the complexity of out-of-order issue and scheduling is handled by creating der abstraction of the out-of...
Several studies have demonstrated that out-of-order execution processors may not be the most adequate organization for wide issue processors due to the increasing penalties that w...
This paper describes the DELFT-JAVA processor and the mechanisms required to dynamically translate JVM instructions into DELFT-JAVA instructions. Using a form of hardware register...