In this paper we present a processor microarchitecture that can simultaneously execute multiple threads and has a clustered design for scalability purposes. A main feature of the ...
Automatic parallelization of general-purpose programs is still not possible in general in the presence of irregular data structures and complex control-flows. One promising strate...
In this paper, we evaluate a mechanism to reissue instructions on the mispredicted speculation path. An instruction which is once dispatched to a functional unit during mispredict...
Concurrent multithreaded architectures exploit both instruction-level and thread-level parallelism through a combination of branch prediction and thread-level control speculation. ...
Run-time parallelization is often the only way to execute the code in parallel when data dependence information is incomplete at compile time. This situation is common in many imp...