Multimedia and network processing applications make extensive use of subword data. Since registers are capable of holding a full data word, when a subword variable is assigned a r...
Predicated execution can eliminate hard to predict branches and help to enable instruction level parallelism. Many current predication variants exist where the result update is co...
—Super-scalar, out-of-order processors that can have tens of read and write requests in the execution window place significant demands on Memory Level Parallelism (MLP). Multi- ...
George C. Caragea, Alexandros Tzannes, Fuat Keceli...
While a number of different shading languages have been developed, their efficient integration into an existing renderer is notoriously difficult, often boiling down to implementi...
Ralf Karrenberg, Dmitri Rubinstein, Philipp Slusal...
Incorrect thread synchronization often leads to concurrency bugs that manifest nondeterministically and are difficult to detect and fix. Past work on detecting concurrency bugs ...